{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "wZ2FFdxgTVfQ"
   },
   "source": [
    "# Lecture 4: Introduction to Probability Theory (Part II)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "hXIJBJQ-TVfU"
   },
   "source": [
    "## Objectives\n",
    "\n",
    "+ To introduce the concept of a probability space.\n",
    "+ To introduce the concept of discrete random variables.\n",
    "+ To introduce the probability mass function and the cumulative distribution function.\n",
    "+ To learn about the expectation and variance of a random variable.\n",
    "+ To introduce joint probability mass function.\n",
    "+ To learn how one can condition a random variable on observations of another.\n",
    "+ To introduce the concept of independent random variables.\n",
    "+ To learn about some basic discrete random variables."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "JaTfOtFSTVfX"
   },
   "source": [
    "## Readings\n",
    "\n",
    "+ These notes.\n",
    "+ The wikipedia page for the [Bernoulli distribution](https://en.wikipedia.org/wiki/Bernoulli_distribution).\n",
    "+ The wikipedia page for the [Binomial distribution](https://en.wikipedia.org/wiki/Binomial_distribution).\n",
    "+ The wikipedia page for the [Poisson distribution](https://en.wikipedia.org/wiki/Poisson_distribution)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "s-c-nOInTVfY"
   },
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "sns.set_context('talk')\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "v9dswPH5TVfh"
   },
   "source": [
    "## Probability spaces\n",
    "\n",
    "In whatever we write below, everything is conditioned on our current information $I$.\n",
    "Because this information is always on the background, we will not be explictily showing it in our notation.\n",
    "\n",
    "Assume that we are doing an experiment.\n",
    "It doesn't matter what exactly the experiment is.\n",
    "The result of the experiment depends on the values of some physical variables $\\omega$ which may be unknown to us (epistemic uncertainty) or truly random (aleatory uncertainty).\n",
    "In the language of mathematical probability theory, this $\\omega$ is called an **event**.\n",
    "The space of all possible $\\omega$'s, denoted by $\\Omega$, is called the **event space**.\n",
    "For today, assume that $\\Omega$ is a discrete space (otherwise things become a little bit more complicated).\n",
    "\n",
    "Since, we are uncertain about which $\\omega$ will appear in nature, we need to assign probabilities over the possible values.\n",
    "Ideally, what we would like to have is some function $\\mathbb{P}(A)$ that takes an arbitrary subset $A$ of $\\Omega$ and tells us how probable it is.\n",
    "That is $\\mathbb{P}$ is a function from all subsets of $\\Omega$, $\\mathcal{F} = \\mathcal{P}(\\Omega)$, to the real numbers:\n",
    "$$\n",
    "\\mathbb{P}: \\mathcal{F} \\rightarrow \\mathbb{R},\n",
    "$$\n",
    "There are a few things that this function should satisfy for all $A$ in $\\mathcal{F}$\n",
    "+ It should be nonnegative, i.e., $\\mathbb{P}(A)\\ge 0$.\n",
    "+ One of the $\\omega$'s must happen, $\\mathbb{P}(\\Omega) = 1$.\n",
    "+ The obvious rule $\\mathbb{P}(A^c) = 1 - \\mathbb{P}(A)$, where $A^c = \\Omega\\setminus A$ is the complement of $A$.\n",
    "When these properties are satisfied, we say that $\\mathbb{P}$ is a probability measure on $\\mathcal{F}$.\n",
    "\n",
    "The triplet $(\\Omega, \\mathcal{F}, \\mathbb{P})$ is called a **probability space**.\n",
    "\n",
    "Note: If we wanted to show the background information we would be writting $\\mathbb{P}[A|I]$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "j9vfYQPWTVfj"
   },
   "source": [
    "## The mathematical definition of a random variable\n",
    "\n",
    "Now assume that we are doing a specific experiment that measures something, say an integer.\n",
    "Assume that the physical variables that determine what is the result of the experiment are $\\omega$ and they take values in a set $\\Omega$.\n",
    "We are uncertain about the $\\omega$'s and we have described this uncertainty using a probability measure $\\mathbb{P}$ on some subsets $\\mathcal{F}$ of $\\Omega$.\n",
    "Call $X$ the result of the experiment.\n",
    "The graph is as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 177
    },
    "colab_type": "code",
    "executionInfo": {
     "elapsed": 580,
     "status": "ok",
     "timestamp": 1579714221366,
     "user": {
      "displayName": "Rohit Tripathy",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AAuE7mBph_bUyWEn5XOlEFll2rKHIRUAoRasanRpNxpmgwo=s64",
      "userId": "06977202960713719461"
     },
     "user_tz": 300
    },
    "id": "mR3hWfBlTVfl",
    "outputId": "23a2a23e-6117-4429-ecd9-63175a1eb434"
   },
   "outputs": [
    {
     "data": {
      "image/svg+xml": [
       "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n",
       "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
       " \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
       "<!-- Generated by graphviz version 2.43.0 (0)\n",
       " -->\n",
       "<!-- Title: omega_X Pages: 1 -->\n",
       "<svg width=\"62pt\" height=\"116pt\"\n",
       " viewBox=\"0.00 0.00 62.00 116.00\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
       "<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 112)\">\n",
       "<title>omega_X</title>\n",
       "<polygon fill=\"white\" stroke=\"transparent\" points=\"-4,4 -4,-112 58,-112 58,4 -4,4\"/>\n",
       "<!-- omega -->\n",
       "<g id=\"node1\" class=\"node\">\n",
       "<title>omega</title>\n",
       "<ellipse fill=\"none\" stroke=\"black\" cx=\"27\" cy=\"-90\" rx=\"27\" ry=\"18\"/>\n",
       "<text text-anchor=\"start\" x=\"21.94\" y=\"-85.8\" font-family=\"Times,serif\" font-size=\"14.00\">ω</text>\n",
       "</g>\n",
       "<!-- X -->\n",
       "<g id=\"node2\" class=\"node\">\n",
       "<title>X</title>\n",
       "<ellipse fill=\"none\" stroke=\"black\" cx=\"27\" cy=\"-18\" rx=\"27\" ry=\"18\"/>\n",
       "<text text-anchor=\"middle\" x=\"27\" y=\"-13.8\" font-family=\"Times,serif\" font-size=\"14.00\">X</text>\n",
       "</g>\n",
       "<!-- omega&#45;&gt;X -->\n",
       "<g id=\"edge1\" class=\"edge\">\n",
       "<title>omega&#45;&gt;X</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M27,-71.7C27,-63.98 27,-54.71 27,-46.11\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"30.5,-46.1 27,-36.1 23.5,-46.1 30.5,-46.1\"/>\n",
       "</g>\n",
       "</g>\n",
       "</svg>\n"
      ],
      "text/plain": [
       "<graphviz.dot.Digraph at 0x109a84ed0>"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from graphviz import Digraph\n",
    "g = Digraph('omega_X')\n",
    "g.node('omega', label='<&omega;>')\n",
    "g.node('X')\n",
    "g.edge('omega', 'X')\n",
    "g.render('omega_X', format='png')\n",
    "g"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "2RUXuSc1TVfr"
   },
   "source": [
    "This brings us to the mathematical definition of a random variable:\n",
    "\n",
    "> A random variable is a function of the event space $X(\\omega)$.\n",
    "\n",
    "Note: if the event space is not discrete, we need some more restrictions on these functions.\n",
    "You need to take a probability theory course to learn about the technical details.\n",
    "\n",
    "Now, if $X(\\omega)$ takes discrete values, like heads or tails, $0, 1, 2$, etc., then we say that $X$ is a discrete random variable.\n",
    "If $X(\\omega)$ takes continuous values, like real numbers, then we say that $X$ is a continuous random variables.\n",
    "Today, we are only going to work with discrete random variables.\n",
    "\n",
    "Notation:\n",
    "+ We will be using upper case letters to represent random variables, like $X, Y, Z$.\n",
    "+ We will be using lower case letters to represent the values of random variables, like $x, y, z$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "WqDInTPdTVfs"
   },
   "source": [
    "### Example: The random variable corresponding to the result of a coin toss (1/2)\n",
    "\n",
    "Let's consider again the coin tossing example we introduced in the previous lecture.\n",
    "\n",
    "![Coin flipping](coin_flipping.png)\n",
    "\n",
    "Remember that we denoted with $v_0$ and $\\omega_0$ the initial velocity and angular velocity of the coin.\n",
    "Then, we showed that the variable $X$ representing the coin toss can be predicted exactly, if we knew $v_0$ and $\\omega_0$.\n",
    "Specifically, we derived the following relationship between the result of the coin toss and the initial conditions:\n",
    "$$\n",
    "X = \n",
    "\\begin{cases}\n",
    "T,&\\;\\text{if}\\;\\frac{2v_0\\omega_0}{g} (\\text{mod}\\;2\\pi) \\in \\left(\\frac{\\pi}{2},\\frac{3\\pi}{2}\\right),\\\\\n",
    "H,&\\;\\text{otherwise}.\n",
    "\\end{cases}\n",
    "$$\n",
    "Graphically, this relationship can be represented by:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 177
    },
    "colab_type": "code",
    "executionInfo": {
     "elapsed": 266,
     "status": "ok",
     "timestamp": 1579714221525,
     "user": {
      "displayName": "Rohit Tripathy",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AAuE7mBph_bUyWEn5XOlEFll2rKHIRUAoRasanRpNxpmgwo=s64",
      "userId": "06977202960713719461"
     },
     "user_tz": 300
    },
    "id": "7JqiOr1sTVfv",
    "outputId": "53c79d75-c1d8-4f07-e180-aab9a58e8148"
   },
   "outputs": [
    {
     "data": {
      "image/svg+xml": [
       "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n",
       "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
       " \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
       "<!-- Generated by graphviz version 2.43.0 (0)\n",
       " -->\n",
       "<!-- Title: coin_toss_g Pages: 1 -->\n",
       "<svg width=\"206pt\" height=\"116pt\"\n",
       " viewBox=\"0.00 0.00 206.00 116.00\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
       "<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 112)\">\n",
       "<title>coin_toss_g</title>\n",
       "<polygon fill=\"white\" stroke=\"transparent\" points=\"-4,4 -4,-112 202,-112 202,4 -4,4\"/>\n",
       "<!-- omega0 -->\n",
       "<g id=\"node1\" class=\"node\">\n",
       "<title>omega0</title>\n",
       "<ellipse fill=\"none\" stroke=\"black\" cx=\"27\" cy=\"-90\" rx=\"27\" ry=\"18\"/>\n",
       "<text text-anchor=\"start\" x=\"18.44\" y=\"-86.8\" font-family=\"Times,serif\" font-size=\"14.00\">ω</text>\n",
       "<text text-anchor=\"start\" x=\"28.56\" y=\"-86.8\" font-family=\"Times,serif\" baseline-shift=\"sub\" font-size=\"14.00\">0</text>\n",
       "</g>\n",
       "<!-- X -->\n",
       "<g id=\"node4\" class=\"node\">\n",
       "<title>X</title>\n",
       "<ellipse fill=\"none\" stroke=\"black\" cx=\"99\" cy=\"-18\" rx=\"27\" ry=\"18\"/>\n",
       "<text text-anchor=\"middle\" x=\"99\" y=\"-13.8\" font-family=\"Times,serif\" font-size=\"14.00\">X</text>\n",
       "</g>\n",
       "<!-- omega0&#45;&gt;X -->\n",
       "<g id=\"edge3\" class=\"edge\">\n",
       "<title>omega0&#45;&gt;X</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M41.57,-74.83C51.75,-64.94 65.52,-51.55 77.03,-40.36\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"79.47,-42.87 84.2,-33.38 74.59,-37.85 79.47,-42.87\"/>\n",
       "</g>\n",
       "<!-- v0 -->\n",
       "<g id=\"node2\" class=\"node\">\n",
       "<title>v0</title>\n",
       "<ellipse fill=\"none\" stroke=\"black\" cx=\"99\" cy=\"-90\" rx=\"27\" ry=\"18\"/>\n",
       "<text text-anchor=\"start\" x=\"92\" y=\"-86.8\" font-family=\"Times,serif\" font-size=\"14.00\">v</text>\n",
       "<text text-anchor=\"start\" x=\"99\" y=\"-86.8\" font-family=\"Times,serif\" baseline-shift=\"sub\" font-size=\"14.00\">0</text>\n",
       "</g>\n",
       "<!-- v0&#45;&gt;X -->\n",
       "<g id=\"edge2\" class=\"edge\">\n",
       "<title>v0&#45;&gt;X</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M99,-71.7C99,-63.98 99,-54.71 99,-46.11\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"102.5,-46.1 99,-36.1 95.5,-46.1 102.5,-46.1\"/>\n",
       "</g>\n",
       "<!-- g -->\n",
       "<g id=\"node3\" class=\"node\">\n",
       "<title>g</title>\n",
       "<ellipse fill=\"lightgrey\" stroke=\"black\" cx=\"171\" cy=\"-90\" rx=\"27\" ry=\"18\"/>\n",
       "<text text-anchor=\"middle\" x=\"171\" y=\"-85.8\" font-family=\"Times,serif\" font-size=\"14.00\">g</text>\n",
       "</g>\n",
       "<!-- g&#45;&gt;X -->\n",
       "<g id=\"edge1\" class=\"edge\">\n",
       "<title>g&#45;&gt;X</title>\n",
       "<path fill=\"none\" stroke=\"black\" d=\"M156.43,-74.83C146.25,-64.94 132.48,-51.55 120.97,-40.36\"/>\n",
       "<polygon fill=\"black\" stroke=\"black\" points=\"123.41,-37.85 113.8,-33.38 118.53,-42.87 123.41,-37.85\"/>\n",
       "</g>\n",
       "</g>\n",
       "</svg>\n"
      ],
      "text/plain": [
       "<graphviz.dot.Digraph at 0x1a1da32450>"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from graphviz import Digraph\n",
    "gct = Digraph('coin_toss_g')\n",
    "gct.node('omega0', label='<&omega;<sub>0</sub>>')\n",
    "gct.node('v0', label='<v<sub>0</sub>>')\n",
    "gct.node('g', style='filled')\n",
    "gct.node('X')\n",
    "gct.edge('g', 'X')\n",
    "gct.edge('v0', 'X')\n",
    "gct.edge('omega0', 'X')\n",
    "gct.render('coin_toss_g', format='png')\n",
    "gct"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "fAk-O8v3TVf0"
   },
   "source": [
    "Then, we argued that the uncertainty about the value of $X$ is induced by our uncertainty about the values of $v_0$ and $\\omega_0$.\n",
    "It is not that the coin toss is random.\n",
    "It is described in extreme detail by Newton's laws.\n",
    "It is that we do not know what the initial conditions are.\n",
    "So, the state of nature is captured by $(v_0,\\omega_0)$.\n",
    "Notice that essentially the variable $X$ is a function of $(v_0,\\omega_0)$.\n",
    "We can write:\n",
    "$$\n",
    "X = X(v_0, \\omega_0).\n",
    "$$\n",
    "You see that the result of the coin toss $X$ is nothing more but a function of the *true state of nature* $(v_0, \\omega_0)$.\n",
    "It is just that the value of $X$ is uncertain because the state of nature is uncertain.\n",
    "$X$ is an example of a random variable."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "kM7RpF6kTVf2"
   },
   "source": [
    "## The probability mass function\n",
    "\n",
    "Take a discrete random variable $X$ defined on some probability space $(\\Omega, \\mathcal{F}, \\mathbb{P})$.\n",
    "Without loss of generality, assume that $X$ can potentially take infinite $\\mathbb{N} = \\{1,2,\\dots\\}$.\n",
    "Why is this sufficient?\n",
    "\n",
    "+ If $X$ takes finite values then we can simply set the probability of values after a given number equal to zero.\n",
    "+ If the values are of another type (e.g., heads and tails) you can just map them to the natural numbers.\n",
    "\n",
    "The probability mass function of the random variable $X$, denoted by $f_X(x)$, gives the probability of $X$ taking the value $x$.\n",
    "Mathematically, it is defined by:\n",
    "$$\n",
    "f_X(x) := \\mathbb{P}(X=x) = \\mathbb{P}\\left(\\{\\omega: X(\\omega) = x\\}\\right).\n",
    "$$\n",
    "Notice that we are just gathering in a set all the states of nature $\\omega$ that give an experiment with value $x$, $X=k$, and then we find probability of that set.\n",
    "\n",
    "If you are 100\\% sure about which random variable you are talking about,\n",
    "feel free to use the much simpler notation:\n",
    "$$\n",
    "p(x) \\equiv p(X=x) \\equiv f_X(x) = \\mathbb{P}\\left(\\{\\omega: X(\\omega) = x\\}\\right).\n",
    "$$\n",
    "This is the notation we will employ from this point on.\n",
    "We will only use the strict mathematical notation when we have no choice.\n",
    "\n",
    "Note: If we wanted to show the background information we would be writing $p(x|I)$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "KT-Ws1wQTVf3"
   },
   "source": [
    "### Properties of the probability mass function\n",
    "\n",
    "There are some standard properties of the probability mass function that is worth memorizing:\n",
    "\n",
    "+ The probability mass function is nonnegative:\n",
    "$$\n",
    "p(x) \\ge 0,\n",
    "$$\n",
    "for all $x$ in $\\mathbb{N}$.\n",
    "+ The probability mass function is normalized:\n",
    "$$\n",
    "\\sum_{x=0}^\\infty p(x) = 1.\n",
    "$$\n",
    "This is a direct consequence of the fact that $X$ must take a value.\n",
    "+ Take any set of possible values of $X$, $A$. The probability of $X$ taking values in $A$ is:\n",
    "$$\n",
    "p(X\\in A) = \\sum_{x\\in A} p(x).\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "gfQBfhcoTVf4"
   },
   "source": [
    "### Example: The random variable corresponding to the result of a coin toss (2/2)\n",
    "\n",
    "Let's write down the probability mass function of the coin toss random variable $X$.\n",
    "Without loss of generality, we can map heads to the number $0$ and tails to the number $1$.\n",
    "We need to specify the probability of one of these events, as the probability of the other one is trivially defined.\n",
    "For a fair coin we have:\n",
    "$$\n",
    "p(X=0) = \\text{probability of heads} = \\frac{1}{2}.\n",
    "$$\n",
    "From this, because of the normalization constraint:\n",
    "$$\n",
    "p(X=0) + p(X=1) = 1,\n",
    "$$\n",
    "we get that:\n",
    "$$\n",
    "p(X=1) = \\frac{1}{2}.\n",
    "$$\n",
    "This is an example of a special random variable taking two discrete values $0$ and $1$, which we call the Bernoulli random variable.\n",
    "We will see it in an example later on."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "caB-hoczTVf5"
   },
   "source": [
    "## Functions of discrete random variables\n",
    "\n",
    "Consider a random variable $X$ taking values in $\\mathbb{N}$ with probability mass function $p(x)$.\n",
    "Now, consider a function $g(x)$.\n",
    "We can now define a new random variable:\n",
    "$$\n",
    "Y = g(X).\n",
    "$$\n",
    "The this random variable takes values in:\n",
    "$$\n",
    "g(\\mathbb{N}) := \\{g(x): x \\in \\mathbb{N}\\}.\n",
    "$$\n",
    "It has its own probability mass function (pmf) which we can define using the pmf of $X$:\n",
    "$$\n",
    "p(y) = p(Y = y) = p(X\\in g^{-1}(y)) = \\sum_{x\\in g^{-1}(y)} p(x),\n",
    "$$\n",
    "where $g^{-1}(y)$ is the set of $x$'s that map to $y$ through $g$, i.e.,\n",
    "$$\n",
    "g^{-1}(y) := \\{x\\in\\mathbb{N}: g(x) = y\\}.\n",
    "$$\n",
    "\n",
    "This is formal definition of the uncertainty propagation problem.\n",
    "The correspondence is that $X$ represents the parameters of a physical model, and $Y = g(X)$ is the uncertain result of the physical model."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "JAAfYbcTTVf5"
   },
   "source": [
    "## Expectation of random variables\n",
    "\n",
    "The expectation of a random variable is defined to be:\n",
    "$$\n",
    "\\mathbb{E}[X] = \\sum_{x=0}^\\infty x p(x).\n",
    "$$\n",
    "You can think of the expectation as the value of the random variable that one should \"expect\" to get.\n",
    "However, take this interpretation with a grain of salt because it may be a value that the random variable has a zero probability of getting...\n",
    "\n",
    "### Example: Expectation of a coin toss\n",
    "The expectation of the coin toss random variable is:\n",
    "$$\n",
    "\\mathbb{E}[X] = 0\\cdot p(X=0) + 1\\cdot p(X=1) = 0.5.\n",
    "$$\n",
    "Of course, this is not a value that the random variable can get.\n",
    "\n",
    "### Properties of the expectation\n",
    "Here are some properties of the expectation.\n",
    "The proof of some of these properties will be given as homework.\n",
    "+ Take any constant $c$. Then we have:\n",
    "$$\n",
    "\\mathbb{E}[X + c] = \\mathbb{E}[X] + c.\n",
    "$$\n",
    "+ For any $\\lambda$ real number, we also have:\n",
    "$$\n",
    "\\mathbb{E}[\\lambda X] = \\lambda \\mathbb{E}[X].\n",
    "$$\n",
    "+ Take two random variables $X$ and $Y$. Then we have:\n",
    "$$\n",
    "\\mathbb{E}[X + Y] = \\mathbb{E}[X] + \\mathbb{E}[Y].\n",
    "$$\n",
    "+ Now consider any function $g(x)$.\n",
    "We can now define the expectation of $g(X)$ as the expectation of the random variable $Y = g(X)$.\n",
    "It is quite easy to show that:\n",
    "$$\n",
    "\\mathbb{E}[g(X)] = \\sum_{x=0}^\\infty g(x) p(x).\n",
    "$$\n",
    "+ Assume that $g(x)$ is a convex function, then:\n",
    "$$\n",
    "g\\left(\\mathbb{E}[X]\\right) \\le \\mathbb{E}[g(X)].\n",
    "$$\n",
    "This is known as Jensen's inequality."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "GjiB-vsCTVf7"
   },
   "source": [
    "## Variance of random variables\n",
    "\n",
    "The variance of a random variable $X$ is defined as the expectation of the square deviation from its expectation, i.e.:\n",
    "$$\n",
    "\\mathbb{V}[X] := \\mathbb{E}\\left[(X - \\mathbb{E}[X])^2\\right].\n",
    "$$\n",
    "You can think of the variance as the spread of the random variable around its expectation.\n",
    "However, do not take this too literally for discrete random variables.\n",
    "\n",
    "### Example: Variance of a coin toss\n",
    "Let's calculate the variance of the coin toss.\n",
    "We need:\n",
    "$$\n",
    "\\mathbb{E}\\left[X^2\\right] = 0^2\\cdot p(X=0) + 1^2 \\cdot p(X=1) = 0.5.\n",
    "$$\n",
    "So, using the formula above we get:\n",
    "$$\n",
    "\\mathbb{V}[X] = \\mathbb{E}\\left[X^2\\right] - \\left(\\mathbb{E}[X]\\right)^2 = 0.5 - (0.5)^2 = 0.5 - 0.25 = 0.25.\n",
    "$$\n",
    "\n",
    "### Properties of the variance\n",
    "Here are some properties of the variance.\n",
    "+ It holds that:\n",
    "$$\n",
    "\\mathbb{V}[X] = \\mathbb{E}\\left[X^2\\right] - \\left(\\mathbb{E}[X]\\right)^2.\n",
    "$$\n",
    "+ For any constant $c$, we have:\n",
    "$$\n",
    "\\mathbb{V}[X + c] = \\mathbb{V}[X].\n",
    "$$\n",
    "+ For any constant $c$, we have:\n",
    "$$\n",
    "\\mathbb{V}[cX] = c^2\\mathbb{V}[X].\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "D7W7N1gdTVf9"
   },
   "source": [
    "## Example: The Bernoulli random variable (1/2)\n",
    "\n",
    "Bernoulli random variable generalizes the concept of a coin toss.\n",
    "You can think of it as the result of an experiment with two possible outcomes $0$ and $1$.\n",
    "One just needs to specify the probability of one of the outcomes, typically the probability of zero.\n",
    "So, how do we denote mathematically a Bernoulli random variable $X$ that takes the value $1$ with probability $\\theta$ in $[0,1]$?\n",
    "We can write:\n",
    "$$\n",
    "X = \\begin{cases} 1,\\;\\text{with probability}\\;\\theta,\\\\\n",
    "0,\\;\\text{otherwise}.\n",
    "\\end{cases}\n",
    "$$\n",
    "Notice that in defining this random variable we are ignoring the mechanism that is giving rise to it.\n",
    "This is ok. It just means that we have decided to not look into it.\n",
    "The other way we can write this is as follows:\n",
    "$$\n",
    "X \\sim \\operatorname{Bernoulli}(\\theta).\n",
    "$$\n",
    "Let's use the functionality of ``scipy.stats`` to define a Bernoulli random variable and sample from it.\n",
    "\n",
    "The expectation of the Bernoulli is:\n",
    "$$\n",
    "\\mathbb{E}[X] = \\sum_x x p(X=x) = 0\\cdot (1-\\theta) + 1\\cdot \\theta = \\theta.\n",
    "$$\n",
    "Similarly, the variance of the Bernoulli is:\n",
    "$$\n",
    "\\mathbb{V}[X] = \\mathbb{E}[X^2] - \\left(\\mathbb{E}[X]\\right)^2 = \\theta - \\theta^2 = \\theta(1-\\theta).\n",
    "$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "mLGI6IQgTVf_"
   },
   "outputs": [],
   "source": [
    "# Import the scipy.stats library\n",
    "import scipy.stats as st\n",
    "# This is the probability of 1:\n",
    "theta = 0.6\n",
    "# Define the random variable, Bernoulli(theta)\n",
    "X = st.bernoulli(theta)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 34
    },
    "colab_type": "code",
    "executionInfo": {
     "elapsed": 389,
     "status": "ok",
     "timestamp": 1579714226556,
     "user": {
      "displayName": "Rohit Tripathy",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AAuE7mBph_bUyWEn5XOlEFll2rKHIRUAoRasanRpNxpmgwo=s64",
      "userId": "06977202960713719461"
     },
     "user_tz": 300
    },
    "id": "WL4mA7J3TVgC",
    "outputId": "74ea4104-40a1-4bc0-ed73-b88a18c14bcf"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "X takes values in (0, 1)\n"
     ]
    }
   ],
   "source": [
    "# Here is the **support** of the random variable. It tells you which variables it takes:\n",
    "print('X takes values in', X.support())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 52
    },
    "colab_type": "code",
    "executionInfo": {
     "elapsed": 414,
     "status": "ok",
     "timestamp": 1579714226747,
     "user": {
      "displayName": "Rohit Tripathy",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AAuE7mBph_bUyWEn5XOlEFll2rKHIRUAoRasanRpNxpmgwo=s64",
      "userId": "06977202960713719461"
     },
     "user_tz": 300
    },
    "id": "xM_Zzal6TVgG",
    "outputId": "ab5a24ce-5cca-483a-ffa7-0cf6d60f9056"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "p(X=0) = 0.40\n",
      "p(X=1) = 0.60\n"
     ]
    }
   ],
   "source": [
    "# Evaluate the probability mass function at every point of the support\n",
    "for x in X.support():\n",
    "    print('p(X={0:d}) = {1:1.2f}'.format(x, X.pmf(x)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 52
    },
    "colab_type": "code",
    "executionInfo": {
     "elapsed": 270,
     "status": "ok",
     "timestamp": 1579714226748,
     "user": {
      "displayName": "Rohit Tripathy",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AAuE7mBph_bUyWEn5XOlEFll2rKHIRUAoRasanRpNxpmgwo=s64",
      "userId": "06977202960713719461"
     },
     "user_tz": 300
    },
    "id": "5dJt5PChTVgK",
    "outputId": "1283d980-55ce-435a-ded2-b8f405064531"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "p(X=0) = 0.40\n",
      "p(X=1) = 1.00\n"
     ]
    }
   ],
   "source": [
    "# Evaluate the cumulative distribution function at every point of the support\n",
    "for x in X.support():\n",
    "    print('p(X={0:d}) = {1:1.2f}'.format(x, X.cdf(x)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 69
    },
    "colab_type": "code",
    "executionInfo": {
     "elapsed": 491,
     "status": "ok",
     "timestamp": 1579714227134,
     "user": {
      "displayName": "Rohit Tripathy",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AAuE7mBph_bUyWEn5XOlEFll2rKHIRUAoRasanRpNxpmgwo=s64",
      "userId": "06977202960713719461"
     },
     "user_tz": 300
    },
    "id": "Js6BlQKbTVgO",
    "outputId": "c5a0e11e-a755-41b1-8a2c-f16483f0df9e"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1 1 1 1 1 1 0 1 0 1 0 1 1 1 0 1 1 1 1 1 0 0 0 1 0 1 1 1 1 0 1 0 1 0 1 0 1\n",
      " 1 1 0 1 0 0 1 0 1 1 1 0 1 0 1 1 0 1 1 0 1 0 0 1 1 1 0 0 0 1 1 0 1 0 1 1 0\n",
      " 1 1 1 1 0 0 0 1 0 1 1 1 1 1 1 0 0 0 1 0 0 0 1 1 0 0]\n"
     ]
    }
   ],
   "source": [
    "# Sample the random variable 100 times:\n",
    "xs = X.rvs(100)\n",
    "print(xs)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 34
    },
    "colab_type": "code",
    "executionInfo": {
     "elapsed": 482,
     "status": "ok",
     "timestamp": 1579714227284,
     "user": {
      "displayName": "Rohit Tripathy",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AAuE7mBph_bUyWEn5XOlEFll2rKHIRUAoRasanRpNxpmgwo=s64",
      "userId": "06977202960713719461"
     },
     "user_tz": 300
    },
    "id": "9b7AwkK5TVgR",
    "outputId": "6ad2c090-1798-42f2-a4a2-0d6d45254042"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "E[X] = 0.60\n"
     ]
    }
   ],
   "source": [
    "# The expectation of the Bernoulli:\n",
    "print('E[X] = {0:1.2f}'.format(X.expect()))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 34
    },
    "colab_type": "code",
    "executionInfo": {
     "elapsed": 287,
     "status": "ok",
     "timestamp": 1579714227285,
     "user": {
      "displayName": "Rohit Tripathy",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AAuE7mBph_bUyWEn5XOlEFll2rKHIRUAoRasanRpNxpmgwo=s64",
      "userId": "06977202960713719461"
     },
     "user_tz": 300
    },
    "id": "ze2DoUw2TVgV",
    "outputId": "a1a62879-5ff8-48ad-9c15-f4e5d5870019"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "V[X] = 0.24\n"
     ]
    }
   ],
   "source": [
    "# The variance of the Bernoulli:\n",
    "print('V[X] = {0:1.2f}'.format(X.var()))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "DB_kSGGETVgY"
   },
   "source": [
    "### Questions\n",
    "\n",
    "+ Rerun the code above for $\\theta = 0.8$.\n",
    "+ Verify using your calculation that ``scipy.stats`` is giving the correct expectation and variance.\n",
    "+ Plot the histogram of 1000 random samples from $X$."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "YilM9UJ8TVgZ"
   },
   "source": [
    "## Joint probability mass function of random variables\n",
    "\n",
    "Consider two random variables $X$ and $Y$.\n",
    "The *joint probability mass function* of the pair $(X,Y)$ is the function $f_{X,Y}(x,y)$ giving the probability that $X=x$ and $Y=y$.\n",
    "Mathematically (and introducing a simplified notation), we have:\n",
    "$$\n",
    "p(x,y) \\equiv p(X=x, Y=y) \\equiv f_{X,Y}(x,y) := \\mathbb{P}\\left(\\{\\omega: X(\\omega) = x, Y(\\omega)=y\\}\\right).\n",
    "$$\n",
    "\n",
    "### Properties of the joint probability mass function\n",
    "+ It is nonnegative:\n",
    "$$\n",
    "p(x,y) \\ge 0.\n",
    "$$\n",
    "+ If you sum over all the possible values of all random variables, you should get one:\n",
    "$$\n",
    "\\sum_x \\sum_y p(x,y) = 1.\n",
    "$$\n",
    "+ If you *marginalize* over the values of one of the random variables you get the pmf of the other.\n",
    "For example:\n",
    "$$\n",
    "p(x) = \\sum_y p(x,y),\n",
    "$$\n",
    "and \n",
    "$$\n",
    "p(y) = \\sum_x p(x, y).\n",
    "$$\n",
    "\n",
    "### The covariance operator\n",
    "\n",
    "The covariance operator measures how correlated two random variables $X$ and $Y$ are.\n",
    "Its definition is:\n",
    "$$\n",
    "\\mathbb{C}[X,Y] = \\mathbb{E}\\left[\\left(X-\\mathbb{E}[X]\\right)\\left(Y-\\mathbb{E}[Y]\\right)\\right].\n",
    "$$\n",
    "If $\\mathbb{C}[X,Y]$ is positive, then we say that the two random variables are correlated.\n",
    "If it is negative, then we say that the two random variables are anti-correlated.\n",
    "If it is zero, then we say that the two random variables are not correlated.\n",
    "We will talk more about this in a later lecture.\n",
    "\n",
    "A usefull property of the covariance operator is that it can give tell you something about the variance of the sum of two random variables.\n",
    "It is:\n",
    "$$\n",
    "\\mathbb{V}[X + Y] = \\mathbb{V}[X] + \\mathbb{V}[Y] + 2\\mathbb{C}[X,Y].\n",
    "$$\n",
    "\n",
    "### Joint probability mass function of many random variables\n",
    "Take $N$ random variables $X_1,\\dots,X_N$.\n",
    "We can define their joint probability mass function in the same way we did it for two:\n",
    "$$\n",
    "p(x_1,\\dots,x_N) \\equiv p(X_1=x_1,\\dots,X_N=x_N) \\equiv f_{X_1,\\dots,X_N}(x_1,\\dots,X_N) := \\mathbb{P}\\left(\\{\\omega: X_1(\\omega)=x_1,\\dots,X_N(\\omega)=x_N\\}\\right).\n",
    "$$\n",
    "Just like before, we can marginalize over any subset of random variables to get the pmf of the remaining ones.\n",
    "For example:\n",
    "$$\n",
    "p(x_i) = \\sum_{x_j,j\\not=i} p(x_1,\\dots,x_N).\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "mxy6hBAuTVga"
   },
   "source": [
    "## Conditioning a random variables on the values of another\n",
    "\n",
    "Take two random variables $X$ and $Y$ with joint pmf $p(x,y)$.\n",
    "Suppose that you observe $Y=y$ and you want to update your state of knowledge about the probability that $X=x$, i.e., you want to get the *conditional pmf* $p(x|y)$.\n",
    "Of course, this is done using Bayes' rule:\n",
    "$$\n",
    "p(x|y) = \\frac{p(x,y)}{p(y)}.\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "sQgqFkZqTVga"
   },
   "source": [
    "### Example: Drawing balls from a box without replacement\n",
    "\n",
    "Consider the following information I:\n",
    "\n",
    "> We are given a box with 10 balls 6 of which are red and 4 of which are blue. The box is sufficiently mixed so that when we get a ball from it, we don't know which one we pick. When we take a ball out of the box, we do not put it back.\n",
    "\n",
    "Now, assume that we represent red balls with a $0$ and blue balls with $1$.\n",
    "Let $X$ be the random variable corresponding to the outcome of the first draw and $Y$ the random variable corresponding to the outcome of the second draw.\n",
    "In this problem, it is not easy to find the joint pmf $p(x,y)$ directly.\n",
    "It is actually easier to first find $p(x)$, then $p(y|x)$ and finally reconstruct $p(x,y)$ using Bayes' rule.\n",
    "Let's do it.\n",
    "\n",
    "We showed in the previous lecture that:\n",
    "$$\n",
    "p(X=0) = \\frac{6}{10} = \\frac{3}{5},\n",
    "$$\n",
    "and\n",
    "$$\n",
    "p(X=1) = 1 - p(X=0) = \\frac{2}{5}.\n",
    "$$\n",
    "\n",
    "We also have:\n",
    "$$\n",
    "p(Y=0|X=0) = \\frac{5}{9},\n",
    "$$\n",
    "since if we draw a red first, $X=0$, there remain $9$ balls $5$ of which are red.\n",
    "$$\n",
    "p(Y=0|X=1) = \\frac{6}{9} = \\frac{2}{3},\n",
    "$$\n",
    "since if we draw a blue first, $X=1$, there remain $9$ balls $6$ of which are red.\n",
    "$$\n",
    "p(Y=1|X=0) = 1 - p(Y=0|X=0) = \\frac{4}{9},\n",
    "$$\n",
    "and\n",
    "$$\n",
    "p(Y=1|X=1) = 1 - p(Y=0|X=1) = \\frac{1}{3}.\n",
    "$$\n",
    "\n",
    "From Bayes' rule we have:\n",
    "$$\n",
    "p(x,y) = p(x)p(y|x),\n",
    "$$\n",
    "and we can completly tabulate this joint pmf:\n",
    "$$\n",
    "\\begin{split}\n",
    "p(X=0, Y=0) &= p(X=0)p(Y=0|X=0) = \\frac{3}{5}\\cdot \\frac{5}{9} \\approx 0.33\\\\\n",
    "p(X=0, Y=1) &= p(X=0)p(Y=1|X=0) = \\frac{3}{5}\\cdot \\frac{4}{9} \\approx 0.27\\\\\n",
    "p(X=1, Y=0) &= p(X=1)p(Y=0|X=1) = \\frac{2}{5}\\cdot \\frac{2}{3} \\approx 0.27\\\\\n",
    "p(X=1, Y=1) &= p(X=1)p(Y=1|X=1) = \\frac{2}{5}\\cdot \\frac{1}{3} \\approx 0.13.\n",
    "\\end{split}\n",
    "$$\n",
    "Notice that these sum to one.\n",
    "\n",
    "Finally, by marginalizing over $x$, we can find the pmf of $Y$:\n",
    "$$\n",
    "\\begin{split}\n",
    "p(Y=0) &= \\sum_x p(X=x,Y=0) = p(X=0, Y=0) + p(X=1, Y=0) \\approx 0.33 + 0.27 = 0.5\\\\\n",
    "p(Y=1) &= \\sum_x p(X=x,Y=1) = p(X=0, Y=1) + p(X=1, Y=1) \\approx 0.27 + 0.33 = 0.5. \n",
    "\\end{split}\n",
    "$$\n",
    "\n",
    "Let's find the covariance of the two random variables.\n",
    "We need:\n",
    "$$\n",
    "\\begin{split}\n",
    "\\mathbb{E}[X] &= 0\\cdot p(X=0) + 1\\cdot p(X=1) = 0.4,\\\\\n",
    "\\mathbb{E}[Y] &= 0\\cdot p(Y=0) + 1\\cdot p(Y=1) = 0.5.\n",
    "\\end{split}\n",
    "$$\n",
    "The covariance is:\n",
    "$$\n",
    "\\begin{split}\n",
    "\\mathbb{C}[X,Y] &= \\mathbb{E}\\left[\\left(X-\\mathbb{E}[X]\\right)\\left(Y-\\mathbb{E}[Y]\\right)\\right]\\\\\n",
    "&= \\sum_{x,y} p(X=x,Y=y)\\left(x-0.4\\right)(y-0.5)\\\\\n",
    "&\\approx 0.33\\cdot (0.2) + 0.27 \\cdot (-0.2) + 0.27 \\cdot (-0.3) + 0.33 \\cdot (0.3)\\\\\n",
    "&= 0.03.\n",
    "\\end{split}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "frVZ6m9gTVgb"
   },
   "source": [
    "## Independent random variables\n",
    "\n",
    "Take two random variables $X$ and $Y$.\n",
    "We say that the two random variables are independent given the background information $I$, and we write:\n",
    "$$\n",
    "X\\perp Y | I,\n",
    "$$\n",
    "if and only if conditioning on one does not tell you anything about the other, i.e.,\n",
    "$$\n",
    "p(x|y, I) = p(x|I).\n",
    "$$\n",
    "It is easy to show using Bayes' rule that the definition is consistent, i.e., you also get:\n",
    "$$\n",
    "p(y|x, I) = p(y|I).\n",
    "$$\n",
    "When there is no ambiguity, we can drop $I$.\n",
    "\n",
    "### Properties of independent random variables\n",
    "+ The joint pmf factorizes:\n",
    "$$\n",
    "p(x,y) = p(x)p(y).\n",
    "$$\n",
    "+ The expectation of the product is the product of the expectation:\n",
    "$$\n",
    "\\mathbb{E}[XY] = \\mathbb{E}[X]\\cdot \\mathbb{E}[Y].\n",
    "$$\n",
    "+ The covariance of two independent random variables is zero:\n",
    "$$\n",
    "\\mathbb{C}[X,Y] = 0.\n",
    "$$\n",
    "Be careful **the reverse is not true!**\n",
    "+ A consequence of the above property is that the variance of the sum of two independent random variables is the sum of the variables:\n",
    "$$\n",
    "\\mathbb{V}[X+Y] = \\mathbb{V}[X] + \\mathbb{V}[Y].\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "HqPoMDPYTVgd"
   },
   "source": [
    "## Example: The binomial distribution\n",
    "\n",
    "Suppose that you tossing $n$ times a coin with probability of heads $\\theta$ and let $X$ be the number of heads.\n",
    "The random variable $X$ is called the binomial random variable.\n",
    "We write:\n",
    "$$\n",
    "X\\sim B(n, \\theta).\n",
    "$$\n",
    "It is easy to show that its pmf is:\n",
    "$$\n",
    "p(X = k) = {n\\choose{k}}\\theta^k(1-\\theta)^{n-k},\n",
    "$$\n",
    "where ${n\\choose{k}}$ is the number of $k$ combinations out of $n$ elements:\n",
    "$$\n",
    "{n\\choose{k}} = \\frac{n!}{k!(n-k)!}.\n",
    "$$\n",
    "The expectation of the binomial is:\n",
    "$$\n",
    "\\mathbb{E}[X] = \\sum_{k=0}^n p(X=k) = \\sum_{k=0}^n{n\\choose{k}}\\theta^k(1-\\theta)^{n-k} = ?.\n",
    "$$\n",
    "It does seem very easy to carry out this sum.\n",
    "However, you can do something smarter.\n",
    "Notice that $X$ counts the number of heads in $n$ independent trials.\n",
    "Let's introduce the *independent* random variables $X_1,\\dots,X_n$ corresponding to the result of these trials.\n",
    "All these varaibles are:\n",
    "$$\n",
    "X_i \\sim \\operatorname{Bernoulli}(\\theta).\n",
    "$$\n",
    "The number of heads is simply:\n",
    "$$\n",
    "X = X_1 + \\dots X_n.\n",
    "$$\n",
    "Since all the random variables on the right hand-side are independent, we get:\n",
    "$$\n",
    "\\mathbb{E}[X] = \\mathbb{E}[X_1] + \\dots + \\mathbb{E}[X_n] = \\theta + \\dots + \\theta = n\\theta.\n",
    "$$\n",
    "Similarly, we can find the variance of $X$:\n",
    "$$\n",
    "\\mathbb{V}{[X]} = \\mathbb{V}[X_1] + \\dots + \\mathbb{V}[X_n] = n \\theta (1-\\theta).\n",
    "$$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 69
    },
    "colab_type": "code",
    "executionInfo": {
     "elapsed": 380,
     "status": "ok",
     "timestamp": 1579714228475,
     "user": {
      "displayName": "Rohit Tripathy",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AAuE7mBph_bUyWEn5XOlEFll2rKHIRUAoRasanRpNxpmgwo=s64",
      "userId": "06977202960713719461"
     },
     "user_tz": 300
    },
    "id": "yQsg4bktTVgf",
    "outputId": "a15784be-cfac-49fb-bc0e-e022754bb773"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[3 3 3 2 2 3 3 2 4 4 4 3 2 4 4 3 3 1 2 2 1 4 4 3 4 2 5 4 2 3 4 3 4 2 2 2 4\n",
      " 3 3 4 5 3 3 3 3 3 3 5 3 2 1 2 3 4 2 4 4 3 3 3 1 2 3 1 3 3 3 1 3 2 3 2 4 1\n",
      " 3 3 3 4 3 2 1 4 2 3 2 2 3 3 3 4 5 2 3 2 1 3 2 2 4 1]\n"
     ]
    }
   ],
   "source": [
    "# Let's draw histograms of the binomial\n",
    "n = 5\n",
    "theta = 0.6\n",
    "X = st.binom(n, theta)\n",
    "# Here are some samples\n",
    "print(X.rvs(100))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 34
    },
    "colab_type": "code",
    "executionInfo": {
     "elapsed": 482,
     "status": "ok",
     "timestamp": 1579714228728,
     "user": {
      "displayName": "Rohit Tripathy",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AAuE7mBph_bUyWEn5XOlEFll2rKHIRUAoRasanRpNxpmgwo=s64",
      "userId": "06977202960713719461"
     },
     "user_tz": 300
    },
    "id": "mZiFkgsPTVgi",
    "outputId": "8d67f0ce-57b4-4479-faa3-b62b5b07505b"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "E[X] = 3.00\n"
     ]
    }
   ],
   "source": [
    "# Here is the expectation\n",
    "print('E[X] = {0:1.2f}'.format(X.expect()))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 34
    },
    "colab_type": "code",
    "executionInfo": {
     "elapsed": 536,
     "status": "ok",
     "timestamp": 1579714228945,
     "user": {
      "displayName": "Rohit Tripathy",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AAuE7mBph_bUyWEn5XOlEFll2rKHIRUAoRasanRpNxpmgwo=s64",
      "userId": "06977202960713719461"
     },
     "user_tz": 300
    },
    "id": "C25VqyT5TVgl",
    "outputId": "ce877e7f-0bb9-443e-af46-e58fa774faa4"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "V[X] = 1.20\n"
     ]
    }
   ],
   "source": [
    "# Here is the variance\n",
    "print('V[X] = {0:1.2f}'.format(X.var()))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 318
    },
    "colab_type": "code",
    "executionInfo": {
     "elapsed": 1040,
     "status": "ok",
     "timestamp": 1579714229668,
     "user": {
      "displayName": "Rohit Tripathy",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AAuE7mBph_bUyWEn5XOlEFll2rKHIRUAoRasanRpNxpmgwo=s64",
      "userId": "06977202960713719461"
     },
     "user_tz": 300
    },
    "id": "dGuR-A0cTVgo",
    "outputId": "ef9f1139-5a47-4856-da0d-f24589b047df"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Text(0, 0.5, '$p(x)$')"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZ0AAAEZCAYAAABM/vhsAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAATsElEQVR4nO3df7BndX3f8edLsP4o4667ok2HOgwqEYQIiW38gYJK0srK8iOE+GMiiWJCM6jTplUTtSQwJSSRmFLbTIxk3EmbGpphRbqbWKO7jFGM4o9QXIr4AzXRIQjuXSGAEN7945xrr3e/u/d7v/d7P98f+3zM3Pne+znn8/2+z+zufe3nnM85n1QVkiS18KhJFyBJOnQYOpKkZgwdSVIzho4kqRlDR5LUzOGTLmCaJXmYLpj3TboWSZohTwAeqar9MiZOmT6wJI8A2bBhw6RLkaSZsbCwAFBVtd/ZNEc6B7dvw4YNG/bu3TvpOiRpZmzcuJGFhYWBZ4i8piNJasbQkSQ1Y+hIkpoxdCRJzRg6kqRmDB1JUjOGjiSpGUNHktSMN4dKGquj37pj0iWMzR1XbJl0CXPHkY4kqRlDR5LUjKEjSWrG0JEkNWPoSJKaMXQkSc0YOpKkZgwdSVIzho4kqRlDR5LUjKEjSWrG0JEkNWPoSJKaMXQkSc0YOpKkZgwdSVIzho4kqRlDR5LUjKEjSWrG0JEkNWPoSJKamarQSXJEkquSfCvJ/UluSrJ1iH6vS/LxJHcl+V6SbyT5H0mOb1G3JGk4UxU6wHbg1cDbgS3AHmB7kjNW6Pck4C+A1wM/AbwDOAn4qyRPX79yJUmrcfikC1jUB8vpwLlVtb1v2wUcA1wJ7DxQ36r6zWVNNyT5JHAr8Crg0nUpWpK0KtM00jkHWACuW2yoqgK2Ac8c4VTZt/vXh8ZTniRpraYpdE4A9lTVI8vab16y/aCSHJbkMUl+GHgvcCddaEmSpsDUnF4DNgNfHNB+z5LtK7lzyX5fBF5cVd880M5J9q7wfhuG+ExJ0pCmaaQDUCNuW/RS4Hl0kxEWgF1JnjWOwiRJazdNI527GTya2dS/3jNg2w+oqr/uv/1kkg8CtwOXA2cdYP+NB3u/fiTkaEeSxmSaRjpfAI5LsrymE/vXW1bzZlV1L92U62PHUJskaQymKXS2AxuBM5e1vwa4rar2rObNkmwCng18aTzlSZLWappOr+0EdgFXJ9kMfBW4ADiFJafHkuwGTq2qLGn7PPBHwG3AfXSjmzcCjwcua1S/JGkFUxM6VVVJzqa7BnM53ahnD93Notev0P2TwM8DTwUeRzeL7QbgZ6pqVaflJEnrZ2pCB6Cq9gEX918H2ue0AW0XrWNZkqQxmaZrOpKkOWfoSJKaMXQkSc0YOpKkZgwdSVIzho4kqRlDR5LUjKEjSWrG0JEkNWPoSJKaMXQkSc0YOpKkZgwdSVIzho4kqRlDR5LUjKEjSWrG0JEkNTNVK4dK0qw7+q07Jl3C2NxxxZaxv6cjHUlSM4aOJKkZQ0eS1IyhI0lqxtCRJDVj6EiSmjF0JEnNGDqSpGYMHUlSM4aOJKkZQ0eS1IyhI0lqxtCRJDVj6EiSmjF0JEnNGDqSpGZcxE1aBy7kJQ3mSEeS1IyhI0lqxtCRJDVj6EiSmjF0JEnNGDqSpGYMHUlSM4aOJKkZQ0eS1IyhI0lqxtCRJDVj6EiSmhkpdJIcm+ScJL+Y5Bf675+x1mKSHJHkqiTfSnJ/kpuSbB2i34VJPpjka32/2/v3OXKtNUmSxmfop0wnOQ64CPhp4CmLzf1r9fvcCVwD/H5V3TpCPduBHwXeDHwV+Dlge5Izq2rnQfr9OrAL+BXgb4HjgUuArUlOqqq9I9QiSRqzFUMnydOA3wTOAe4HPgbcCHwZuJsueDYBTweeC1wIvCHJtcBbquorwxSS5AzgdODcqtret+0CjgGuBA4WOidX1d8t+fmGJHuA3cDPAv95mBokSetrmJHOHuD/0I06rq2q+w62c5J/DJwHvLHv+9ghazkHWACuW2yoqkqyDXhPkuOras+gjssCZ9Gn+9ejhvx8SdI6GyZ0zq+q61berdOH0jZgW5KzVlHLCcCeqnpkWfvNS7ev4v1e0r/ecqAdkqx02m3DKj5PkrSCFScSrCZw1th3M3DPgPZ7lmwfSpJNwFXA7XTXmCRJU2DVy1UnOb+q1usXeY247fuSPB74AN11phdV1YMHfMOqjSu8114c7UjS2IwyZfqPk/zi2CvpJiUMGs1s6l8HjYJ+QJLHAR8ETgbOqKqbV+giSWpolNB5H/Bfk7xt0MYkz0vysRHe9wvAcUmW13Ri/3rAazP95z6WbhLC84CXV9UnRqhBkrSOVh06VXUh8FvAZUnetdie5If7adJ/STfSWK3twEbgzGXtrwFuO9DMtf6zH0N3Su2FwFlVdcMIny9JWmervqYDUFW/0t8IemV/1/+9wGvprrv8PnDpCG+7k+4Gz6uTbKa7OfQC4BTg+7PgkuwGTq2qLOn7p8C/7D/33iTPXbLtrqr68gj1SJLGbKTQ6f0BsAV4FV3YvB94x7A3gy7X35NzNnB5/7WRbor0uVV1/QrdX96//of+a6ltdPcYSZImbJTZa48GfonukTNHAp8FTgIeDXxjLcVU1T7g4v7rQPucNqAtA3aVJE2ZUSYS3A78DnAXsLWqnkM32tkK7ExyxBjrkyTNkVFC5zDg9cCzq2oHQH/fzla6Z6/tSvKk8ZUoSZoXo1zTeUZVPbC8sar+d5LTgR3AJ4Bj11qcJGm+jDJler/AWbLtr+imLQ/7kE9J0iFk7CuH9uvovGDc7ytJmn0rhk6Sl672TavqG33f00cpSpI0n4YZ6fx5ko8meXmSw1baOcmj++Wrb+DgC69Jkg4xw0wkOJluivQHgW8n+TDwKbqVQ+/h/68c+gy62Wsv6X/+EN39O5IkAUOETlXdAvxkkufR3RR6FvBK9l9qIMA+4Frg96rq00iStMTQU6ar6kbgxv4U248Bx9M9kaDobhS9BfjcgJU/JUkCRrhPp6r+ge702qfGX44kaZ6N9MDPJE8BzgeOpnvC9GeBD1fV34+vNEnSvBnlgZ8vpJuV9ni66ziL7k5yWVVdNa7iJEnzZZSbQ9/Zv74WeCrdaOcVdLPZfjfJH4+nNEnSvBnl9NoJwJVVtW1J29eBa5K8DnhPkk9U1bvHUqEkaW6MMtL5Ll3I7KeqrqZbzO2itRQlSZpPo4TOLuCMFbY/bbRyJEnzbJTQeQ/w/CRvOsD2o4FvjlyRJGlujXJN5yPAw8DvJDkHeC/wmb7tVOBNwFvGVqEkaW6MEjqXAc+me67ai/qvpY/E+RywN8kJwP+tqofXXKUkaS6M8kSCSxa/T7KR7oGgJ/VfJwM/Avw3uiB6KMltwM1V9bNjqViSNLNGeiLBoqraSzdxYNdiW5J/RDetejGETgbOXMvnSJLmw5pCZ5Cq+h7dY3E+O+73liTNtrEvVy1J0oEYOpKkZgwdSVIzho4kqRlDR5LUjKEjSWrG0JEkNWPoSJKaMXQkSc0YOpKkZgwdSVIzho4kqRlDR5LUjKEjSWrG0JEkNWPoSJKaMXQkSc0YOpKkZgwdSVIzho4kqRlDR5LUjKEjSWrG0JEkNWPoSJKamarQSXJEkquSfCvJ/UluSrJ1iH6nJPnDJJ9P8lCSalGvJGl1pip0gO3Aq4G3A1uAPcD2JGes0O+lwGnAl4DPr2eBkqTRTU3o9MFyOnBhVV1dVR8FLgBuBK5coftlVXVMVZ0HfHydS5UkjWhqQgc4B1gArltsqKoCtgHPTHL8gTpW1SPrX54kaa0On3QBS5wA7BkQIDcv3T7OD0yyd4VdNozz8yTpUDdNI53NwD0D2u9Zsl2SNMOmaaQDcLBZZ2OfkVZVGw+2vR8JOdqRpDGZppHO3QwezWzqXweNgiRJM2SaQucLwHFJltd0Yv96S+N6JEljNk2hsx3YCJy5rP01wG1VNdZJBJKk9qbpms5OYBdwdZLNwFfp7tM5BThrcacku4FTqypL2o4ETu1/fHrfdl7/8x1VddO6Vy9JWtHUhE5VVZKzgcv7r410U6TPrarrV+j+LOB/Lmtb/Hkb8HNjLFWSNKKpCR2AqtoHXNx/HWif0wa07Qay386SpKkyTdd0JElzztCRJDVj6EiSmjF0JEnNGDqSpGYMHUlSM1M1ZVrz5ei37ph0CWNxxxVbJl2CNDcc6UiSmjF0JEnNGDqSpGYMHUlSM4aOJKkZQ0eS1IyhI0lqxtCRJDVj6EiSmjF0JEnNGDqSpGYMHUlSM4aOJKkZQ0eS1IyhI0lqxtCRJDVj6EiSmjF0JEnNGDqSpGYMHUlSM4aOJKkZQ0eS1IyhI0lqxtCRJDVj6EiSmjF0JEnNGDqSpGYMHUlSM4aOJKkZQ0eS1IyhI0lqxtCRJDVj6EiSmjF0JEnNGDqSpGYMHUlSM4aOJKkZQ0eS1IyhI0lq5vBJFzDPjn7rjkmXMDZ3XLFl0iVImgNTNdJJckSSq5J8K8n9SW5KsnXIvk9L8oEkC0m+m2RnkuPXu2ZJ0vCmKnSA7cCrgbcDW4A9wPYkZxysU5InAx8DjgYuAF4JbAJuSHLUehYsSRre1Jxe64PldODcqtret+0CjgGuBHYepPu/A54IPKeqvtn3vRH4KvA24F+vY+mSpCFN00jnHGABuG6xoaoK2AY8c4VTZecAH14MnL7v3cD1wLnrU64kabXS/V6fvH5kUlX1/GXtPw58EviZqrpmQL/HAfcBV1TVry7b9hbgCuApVfV3A/ruXaGsDQAbNmxYzaF8374HHh6p3zR6wmNXPyiel+M/lI8dVn/8h/Kxg8cPsLCwAN3v8/0GNlNzeg3YDHxxQPs9S7YP8kQgS/Y7UN/9QmdItbCwsG/Evi0sJuLCen7IwoPr+e4j89jX+djh0D7+Q/nYYU3H/wTgkUEbpil0AA427FppSLbqvlW1ccWKptziaG0ejmW1PPZD89jh0D7+WT/2abqmczeDRzOb+tdBIxmA79CFyih9JUkNTVPofAE4Lsnymk7sX28Z1Kmq7ge+ApwwYPOJwF2DrudIktqbptDZDmwEzlzW/hrgtqras0Lfn0jyTxYbkmzq3+vacRcqSRrNNIXOTmAXcHWS1yZ5cZL3AacA/35xpyS7kyy/RvNOuotqO5OclWQLsAN4GLi8SfWSpBVNTej09+ScDbyfLij+DPgRuptFr1+h753AC4FvAH8E/AmwF3hRVX19PeuWJA1vau7T0WhmfSbLWnjsh+axw6F9/LN+7FMz0pEkzT9HOpKkZhzpSJKaMXQkSc0YOpKkZgydGbSWFVbnQZKjkvynJH+Z5N4kleS0Sde13pK8NMn7ktyW5O+T/E2Sa5OcuHLv2Zfk+Uk+lORvkzyQ5K4kH03ysknXNglJfq3/u//5SdeyGobObBpphdU58nS61WHvBT4y4Vpaugh4KvAu4GXAv+1//nSS506ysEaeCNwG/DLwr4BfAB6kuyn8FZMsrLUkzwLeAtw56VpWy9lrM6YPlh384AqroVuue3NVHTfJ+lpI8qiqeqT//my6EH5xVe2eaGHrLMmTlz9HMMlGuhVyP1pVPzWZyiYnyeF0x397Vb1k0vW00D+f8hPAp+meL7mxqk6abFXDc6Qze9aywupcWAycQ82gB9dW1V7gduCo9hVNXlU9TPfv4aFJ19LQv6H7837bpAsZhaEze04A9gz4xXvzku06RCQ5ku7PfOBT2OdRkkclOTzJP03y68CxdKcc516SY4BLgYurapoXlzygaVvETSsbdYVVzZn+tOp76P7z+M4Jl9PSNcDiqcR9wPlV9ecTrKeJ/s/7D4APVdUHJl3PqBzpzKa1rLCq+fHbdA/Jvaiqbp10MQ29GfgXwFa6p9Nfk+SVky2pidcDzwHeMOlC1sKRzuwZdYVVzZEk/5FuFtebqup9Ey6nqar6Ct3CjQDXJ7ke+C9J/mRer/cleRLwW8BvAPf1E0ig+x1+WP/zA1X1wKRqHJYjndkz0gqrmh9JLgV+FXhzVV016XqmwKfoplMfOelC1tFRwAa60PnOkq8X0F3T+w7wa5MqbjUc6cye7cDr6FZFvW5J+zArrGrGJbkEeAfwjqr67UnXM2n9dY7T6NbPunuy1ayrLwEvHtD+u8ARwIXATKwdZujMnqUrrG6mu0fhAroVVs+aZGEtJTmv//af96+n9qcg7quqP5tQWesqyS/T/W/2fwF/seyG0Aer6nMTKayRJP8d+BrwGeDbwA/R/d1/CfCGfvr0XKqqe4Hdy9uXrK2z37Zp5c2hMyjJE+hWVz0P2Ej3RIJLZ3lGy2oNWLJ80deq6uiWtbSSZDdw6gE2z+1xL0pyMd2TOI6lO9W0ANwEvHul1YXnVf93YqZuDjV0JEnNOJFAktSMoSNJasbQkSQ1Y+hIkpoxdCRJzRg6kqRmDB1JUjOGjiSpGUNHktSMoSNJasbQkWZEkscl+ZskX0/ymGXb3pvkH5K8YlL1ScMwdKQZUVX3A5cA/wz4pcX2JL9Bt9zFG6rq/RMqTxqKD/yUZkiSw4C/Bp4MHEO3jsq7gEuq6tJJ1iYNw9CRZkySlwPXAx+hW0vm3VX1xslWJQ3H0JFmUJLPAD8KvB94VfkPWTPCazrSjElyPrC4aNd3DRzNEkc60gxJ8pN0p9auBx4Cfho4sapunWhh0pAMHWlGJPlxuus4nwJeBhwF3ArsrKqzJ1mbNCxPr0kzIMlxwA7gi8DZVfVgVX0ZuBo4K8kLJlqgNCRHOtKUS/JU4OPA94DnV9WdS7b9EPBl4HNVZfBo6hk6kqRmPL0mSWrG0JEkNWPoSJKaMXQkSc0YOpKkZgwdSVIzho4kqRlDR5LUjKEjSWrG0JEkNfP/ANpX3W+MG2brAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Let's draw the pmf\n",
    "fig, ax = plt.subplots()\n",
    "xs = range(n)\n",
    "ax.bar(xs, X.pmf(xs))\n",
    "ax.set_xlabel('$x$')\n",
    "ax.set_ylabel('$p(x)$')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "Jzh5WXoMTVgr"
   },
   "source": [
    "### Questions\n",
    "\n",
    "+ Start increasing the number of trials $n$. How does the resulting pmf look like?\n",
    "This starts to look like a bell curve. And indeed it is!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "TWSyuUq_TVgs"
   },
   "source": [
    "## Example: The categorical distribution\n",
    "\n",
    "This is a generalization of the Bernoulli also known as *multinulli*.\n",
    "It is the probability distribution assigned to a random variable taking $K$ different values each one with a given, but different, probability.\n",
    "It is:\n",
    "$$\n",
    "p(X=k) = p_k.\n",
    "$$\n",
    "For example, if all the different values are equally probably, then we could have:\n",
    "$$\n",
    "p(X=k) = \\frac{1}{K}.\n",
    "$$\n",
    "Let's see how we can sample from it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "WsMDMPOyTVgt"
   },
   "outputs": [],
   "source": [
    "# Just pick some probabilities\n",
    "ps = [0.1, 0.3, 0.4, 0.2] # this has to sum to 1\n",
    "# And here are the corresponding values\n",
    "xs = [0, 1, 2, 3]\n",
    "# Here is how you can define a categorical rv:\n",
    "X = st.rv_discrete(name='Custom Categorical', values=(xs, ps))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 34
    },
    "colab_type": "code",
    "executionInfo": {
     "elapsed": 403,
     "status": "ok",
     "timestamp": 1579714229671,
     "user": {
      "displayName": "Rohit Tripathy",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AAuE7mBph_bUyWEn5XOlEFll2rKHIRUAoRasanRpNxpmgwo=s64",
      "userId": "06977202960713719461"
     },
     "user_tz": 300
    },
    "id": "aY9_UWgnTVgv",
    "outputId": "3a86bd5e-3de1-48c4-ee25-235d27ed2603"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[2 2 2 2 3 2 0 3 1 2]\n"
     ]
    }
   ],
   "source": [
    "print(X.rvs(size=10))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 34
    },
    "colab_type": "code",
    "executionInfo": {
     "elapsed": 406,
     "status": "ok",
     "timestamp": 1579714229873,
     "user": {
      "displayName": "Rohit Tripathy",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AAuE7mBph_bUyWEn5XOlEFll2rKHIRUAoRasanRpNxpmgwo=s64",
      "userId": "06977202960713719461"
     },
     "user_tz": 300
    },
    "id": "HbX_T-9PTVgz",
    "outputId": "797e269c-e872-4779-f06a-3287664d24d6"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.4\n"
     ]
    }
   ],
   "source": [
    "# You can get the pmf:\n",
    "print(X.pmf(2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "ocG8zYRlTVg0"
   },
   "source": [
    "## Example: The Poisson distribution\n",
    "\n",
    "The Poisson distribution models the number of times an event occurs in an interval of space or time.\n",
    "For example, a Poisson random variable $X$ may be:\n",
    "\n",
    "+ The number earthquakes greater than 6 Richter occuring over the next 100 years.\n",
    "+ The number of major floods over the next 100 years.\n",
    "+ The number of patients arriving at the emergency room during the night shift.\n",
    "+ The number of electrons hitting a detector in a specific time interval.\n",
    "\n",
    "The Poisson is a good model when the following assumptions are true:\n",
    "+ The number of times an event occurs in an interval takes values $0,1,2,\\dots$.\n",
    "+ Events occur independently.\n",
    "+ The probability that an event occurs is constant per unit of time.\n",
    "+ The average rate at which events occur is constant.\n",
    "+ Events cannot occur at the same time.\n",
    "\n",
    "When these assumptions are valid, we can write:\n",
    "$$\n",
    "X\\sim \\operatorname{Pois}(\\lambda),\n",
    "$$\n",
    "where $\\lambda>0$ is the rate with each the events occur.\n",
    "The pmf of the Poisson is:\n",
    "$$\n",
    "p(X=k) = \\frac{\\lambda^ke^{-\\lambda}}{k!}.\n",
    "$$\n",
    "The expectation of the Poisson is:\n",
    "$$\n",
    "\\mathbb{E}[X] = \\sum_{k=0}^\\infty k p(X=k) = \\lambda.\n",
    "$$\n",
    "The variance is:\n",
    "$$\n",
    "\\mathbb{V}[X] = \\dots = \\lambda.\n",
    "$$\n",
    "\n",
    "Let's look at a specific example.\n",
    "Historical data show that at a given region a major earthquake occurs once every 100 years on average.\n",
    "What is the probability that $k$ such earthquakes will occur within the next 100 years.\n",
    "Let $X$ be the random variable corresponding to the number of earthquakes over the next 100 years.\n",
    "Assuming the Poisson model is valid, the rate parameter is $\\lambda = 1$ and we have:\n",
    "$$\n",
    "X\\sim \\operatorname{Pois}(1).\n",
    "$$\n",
    "The probabilities are:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 318
    },
    "colab_type": "code",
    "executionInfo": {
     "elapsed": 648,
     "status": "ok",
     "timestamp": 1579714230472,
     "user": {
      "displayName": "Rohit Tripathy",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AAuE7mBph_bUyWEn5XOlEFll2rKHIRUAoRasanRpNxpmgwo=s64",
      "userId": "06977202960713719461"
     },
     "user_tz": 300
    },
    "id": "gI9SrpJoTVg1",
    "outputId": "99f4aa49-b6d8-4a93-9e43-6d0161f9880c"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Text(0, 0.5, 'Probability of occurance')"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAcwAAAEZCAYAAAAaKBUaAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjMsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+AADFEAAAgAElEQVR4nO3dd7wcZdn/8c83QURaQiKoqBBApCPYUaQIKNJDV5TyAGJB5VFBeEBBlCZgQUFEIgEURNRQBEVKQpEfSCjSQw0ghBoSekuu3x/3vWSymbNn5pw92c3J9/167eucveeemWtmdveacs89igjMzMystSGdDsDMzGxe4IRpZmZWgROmmZlZBU6YZmZmFThhmpmZVbBApwPoJpLeIO1EPNfpWMzM5iGLAzMjYlDnFPm2klkkzQQ0bNiwTodiZjbPmD59OkBExKA+azmo9wb64Llhw4YNmzZtWqfjMDObZwwfPpzp06cP+jNztROmpEWAdYB3AJdFxBNtj8rMzKzL1Dp8lvRV4FHgn8AZwGq5fElJr0j6cvtDNDMz67zKCVPSdsCJwHhgL0CNYRHxFPAPYOt2B2hmZtYN6hxh7g+Mj4jRwPklwycCq7clKjMzsy5TJ2GuAYxrMXwKsFT/wjEzM+tOdRLmjF7qLw282L9wzMzMulOdhPkf4LNlAyQNAXYAbmhHUGZmZt2mTsL8FfA5ST8CRjTGl7QScC6pxewJbY7PzMysK1S+DzMizpG0BnAwcFAu/geptayAQyPi7+0Pcd4x6sCLOh1CJZOP3rxy3cG4TGZmfVGr44KIOETSX4FdgJVJifJe4MyImDgA8ZmZmXWF2j39RMRNwE0DEIuZmVnXqtNxwQhJa7YYvqakJdoTlpmZWXep0+jnJ8DYFsNPA47qVzRmZmZdqk7C3BC4sMXwC4CN+xeOmZlZd6qTMJcGHm4x/L+5jpmZ2aBTJ2G+CCzbYviywKv9C8fMzKw71UmY1wO7SVqseUAu2xX4d7sCMzMz6yZ1EuZxwHuAayVtL+l9klaQtD1wbR527EAEaWZm1ml1evoZL+lrwC+Ac5oGvw7sGxGXtTM4MzOzblG3p5/fSPobsCPwPlJPP5OAP0fEowMQn5mZWVfoS08/jwI/G4BYzMzMulada5hmZmbzrVpHmJLWAfYFVgRGkk7JFkVErNCm2MzMzLpG5YQpaVdS93evA/fQuhMDMzOzQaXOEebBpAY+G0fEYwMUj5mZWVeqcw1zWeDXTpZmZjY/qpMw/wu8daACMTMz62Z1EubJwC6Shg5UMGZmZt2qzjXMG4HtgH9LOhF4EJjRXCkirmpTbGZmZl2jTsK8vPD/qUA0DVcu8xGomZkNOnUS5h4DFoWZmVmXq9P5+ukDGYiZmVk3c9d4ZmZmFdTufF3SO4APA0tQknAj4ow2xGVmZtZV6nSNNwQ4EdiL1kemTphmZjbo1Dkl+11gH+BsYDdSq9gDga8D9wITgU3aHaCZmVk3qJMwdwMuiYhdgb/nshsj4mTgQ8Db818zM7NBp07CXJ5ZiXJm/vsWgIh4kfQkk73aF5qZmVn3qJMwXyY92gvgBVInBUsVhj8OvLdNcZmZmXWVOgnzIWAFgIh4HbgP2LQwfGPgifaFZmZm1j3qJMwrgNGF92cCn5c0XtIEYAfgT22MzczMrGvUSZjHAV+T1HjE11HAr4APAKsBpwCH1pm5pEUlnSBpiqSXJU2UtFWF8faU9C9JT0l6TdIjks6WtGqd+ZuZmVVVp2u8KcCUwvsZwDfzq6/GAR8EDiA9/WR3YJykLSPi4hbjvR24DDgWeBZYDvgecL2ktSPivn7EZGZmNodKCVPSosAFwB8iYkw7ZixpM9J1z20jYlwuG09qjXs80GPCjIhjmoqulHQdcBfwBeDwdsRoZmbWUOmUbES8AHykzfMeDUwHzi/MJ4DTgZX7cHr16fz39Za1zMzM+qBOX7K3AKu0cd6rA3dGxMym8luLw1tNQNJQ0jKMAo4htdLt8akqkqb1EtOwXoabmdl8qk6jn0OBvSVt2KZ5jwSmlpRPLQzvzRPAK8DdpGS+YUQ81p7wzMzMZqlzhPlF4GHgMkn/Ae4BXmqqExGxZ41pRh+HNWwEvI103XM/YLykjSLijtIJRgxvNbF8BOqjTDMzm0OdhLl74f+18qtZAFUT5jOUH0WOyH/Ljj5nn1nEf/K/10m6gNQJ/JHA1hVjMDMzq6TyKdmIGFLhNbTGvO8AVsmPDStaI/+9vca0Gg2T7gTeX2c8MzOzKupcw2y3ccBwYMum8l2BSRHRssFPM0kjSJ0o+B5MMzNruzqnZNvtYmA8MEbSSFLHBbsB61I4pZq73Vs/IlQou4XUNd8k4EXSUeU3gYWBH82l+M3MbD5SOWFKuqJCtYiIjapMLyJC0jaka45Hko427yR1ZHBhL6NfB+wBLENq9PMEcCWwU0TUOpVrZmZWRZ0jzOWZs+XqAsC7SKd2nyYd7VUWEc8B++ZXT3U2KCn7Sp35mJmZ9VedvmRHlZXnzti/TTriW789YZmZmXWXfjf6iYhXI+Io4Hrgp/0PyczMrPu0s5XsNcBn2zg9MzOzrtHOhLkcsGAbp2dmZtY16rSSXaaHQSNIj+n6JjChDTGZmZl1nTqtZCfTc/+uInWA3p+HSZuZmXWtOgnzcOZMmEHq8/Ue4LKSR3WZmZkNCnVuKzlsAOMwMzPrap3sS9bMzGyeUTlhSvqhpB67nZN0q6RD2hOWmZlZd6lzhDkauLTF8EuB7fsXjpmZWXeqkzCXI7WE7cmkXMfMzGzQqXsNc3iLYUsAdR4gbWZmNs+okzDvoPCcyiJJArai9RGomZnZPKtOwhwDfFzSWElLNgrz/78DPp7rmJmZDTp17sP8raT1gV2BL0maQuq4YGlSTz/nRMSvByZMMzOzzqp1DTMivgjsDPwNmA48D1wA7BgRn29/eGZmZt2hTtd4AETEn4A/DUAsZmZmXatOxwULSFq8xfDFJdVOwGZmZvOCOqdkjwcmthh+A3BM/8IxMzPrTnUS5meBv7QY/hfgc/0Lx8zMrDvVSZjvBe5vMfyBXMfMzGzQqZMwXwPe1WL4OwE/D9PMzAalOgnzZmBHSQs2D8hlOwG3tiswMzOzblInYZ4IrAZcJOnDkhbMrw+T7stcFfjVQARpZmbWaXV6+vmLpKOAg4DrSb38BCnpCjgmIs4ZkCjNzMw6rNZ9kxFxsKTzgC8C7yMlyknAWRFxwwDEZ2Zm1hX60tPPDaR7Ls3MzOYbfeqZR9JIZj0s+sGIeKZ9IZmZmXWfWp2vS/qApCuBJ0nXMa8HnpQ0QdKaAxGgmZlZN6h8hClpdeAaYCHSE0puz4NWA7YErpb0iYi4o+1RmpmZdVidU7KHA68Dn4iI24oDcjK9KtfZrn3hmZmZdYc6p2TXA05sTpYAEXE7cBKwfrsCMzMz6yZ1EuYiwOMthk/JdczMzAadOgnzAWCLFsO3yHXMzMwGnToJ8wzgs5LOkrSapKH5tbqkPwCfAcYOSJRmZmYdVqfRz3HAB4GdSR2tN55M0uga70+kh0ybmZkNOnX6kp0B7CTpVGAbUscFIj0j87yIuGxgQjQzM+u8vnSNdylw6QDEYmZm1rVq9fRjZmY2v3LCNDMzq8AJ08zMrAInTDMzswqcMM3MzCroMWFKekDSVoX3P8idrLeNpEUlnSBpiqSXJU0szrPFeHtJukDSQ3m8e/N0lmxnfGZmZg2tjjCXARYrvD8MaPczL8cBuwCHAJsDdwLjJG3Wy3g/BJ4DDgI2BX4K7AjcIGl4m2M0MzNreR/mo8AaTWXRrhnnpLgxsG1EjMtl44HlST0GXdxi9LUj4snC+ysl3QlMAL4E/LJdcZqZmUHrhHk+cICkTYGpuewQSXu3GCciYqOK8x4NTM/zeXNkSacDp0haNSLu7GEmT5YU35D/vqfi/M3MzCprlTC/BzxLOgpclnR0uSSwcJvmvTpwZ0TMbCq/tTi8xvQ+nf/e3lMFSdN6mcawGvMzM7P5SI8JMyJeBg7NLyTNBPaLiLPaNO+RwD0l5VMLwyuRNAI4AbiX1Am8mZlZW9XpS3YP4No2z7/VNdFK10slLQycB4wA1ouIV3ucYETLBkH5CNRHmWZmNoc6Tys5vfG/pJGkp5UAPBgRz/Rh3s9QfhQ5Iv+dWjJsNpLeBlwArA18NiJu7WUUMzOzPqnVcYGkD0i6EngSuD6/npQ0QVLdW07uAFaR1BxDo2Vuj9cicywLkRoMrQNsERHtPvo1MzN7U+UjzNxpwTXAQqSjukZCWw3YErha0ici4o6KkxwH7JnHPb9QviswqacWsjmWt5JOw34K2DIirqy6HGZmZn1R5xrm4cDrwCci4rbigJxMr8p1tqs4vYuB8cCYfIr3QWA3YF1g68K0JwDrR4QK4/4Z+Gye3wuSPl4Y9lRE3F9juczMzHpVJ2GuB5zYnCwBIuJ2SScBX6k6sXzP5TbAkfk1nHQbybYRcWEvo2+R//4gv4pOB3avGoeZmVkVdRLmIsDjLYZPyXUqi4jngH3zq6c6G5SUqaSqmZnZgKnT6OcBZh3Zldki1zEzMxt06hxhngEcJeks4Ajg7ly+CqkT9M8AB7Y3PLP2GnXgRZ0OoZLJR2/e6RDMrEmdhHkc8EFgZ2AnoNGl3RBApB52jm9rdGZmZl2iTscFM4CdJJ0KbEPquEDA/cB5EXHZwIRoZmbWeXWOMAGIiEuBSwcgFjMzs65Vq6cfMzOz+ZUTppmZWQVOmGZmZhU4YZqZmVXghGlmZlaBE6aZmVkFlROmpEsl7SRpwYEMyMzMrBvVOcL8EHAW8Jikn0tao7cRzMzMBos6CfOdwC7AzcA3gFskXS9pb0mLDkh0ZmZmXaJywoyI1yLijxGxCbA88GPgHcBvgCmSxkj65ADFaWZm1lF9avQTEQ9FxKGk/mQ3BcaTHtp8laQ7JX1LUq1nY5qZmXWz/raSXQvYCvgUszpinwn8DLhP0if6OX0zM7OuUDthShou6euSbgImAnsBlwAbR8T7I2J1YGPgJeDEtkZrZmbWIZWfViLp08CewGhgIeAe4ABgbEQ8U6wbEVdIOhonTDMzGyTqPN7rMuBV4K/AKRFxZS/17wP+1dfAzMzMukmdhPkd4PSImFqlckSMJzUGMjMzm+fVuYa5GLB0TwMlrSbpB/0PyczMrPvUSZiHAmu2GL56rmNmZjbo1EmY6mX4QsAb/YjFzMysa7W8hilpcWB4oWikpGVKqo4gdZv3SBtjMzMz6xq9Nfr5X6BxXTKAn+dXGZFuMzEzMxt0ekuYE/JfkRLnOODWpjoBvABcFxHXtjU6MzOzLtEyYeZ7La8EkLQscHJEXD83AjMzM+smle/DjIg9BjIQMzOzbtZjwmw07omIh4vve9Oob2ZmNpi0OsKcDMyUtHBEvJbfR4VpDm1DXGZmZl2lVcI8nJQg32h6b2ZmNt/pMWFGxGGt3puZmc1P+vsAaTMzs/mCE6aZmVkFrVrJzqT+NcuIiDqPDDMzM5sntEpuZ+BGPmZmZkDrRj+7z8U4zMzMupqvYZqZmVXghGlmZlZBq0Y/DwIzgZUj4nVJD1SYXkTECm2LzszMrEu0avTzEKnRT6Phz8O4EZCZmc2nWjX62aDVezMzs/lJR69hSlpU0gmSpkh6WdJESVtVGG9dSb+TdIuk1yX5yNfMzAZU7U4GJL0V2ABYPhc9AFwZEa/0Yf7jgA8CBwAPArsD4yRtGREXtxhvoxzDTcDrwIf7MG8zM7PKaiVMSbsCPwWWAJSLA5gm6TsRMbbGtDYDNga2jYhxuWw8KREfD7RKmD+KiB/mcX6OE6aZmQ2wyqdkJe0EjAVeAA4GtgFGA4fksjG5TlWjgenA+Y2CiAjgdGBlSav2NGJEzKwxHzMzs36rc4T5f8DdwMcj4rlC+fmSTgKuJyXScypOb3XgzpLkd2txeI34eiVpWi9VhrVzfmZmNnjUafSzEnBaU7IEICKmA6cBK9aY3khgakn51MJwMzOzrlDnCPNxZl23LDMTeKLm/Fu1bm17y9eIGN5qeD4C9VGmmZnNoc4R5lhgd0mLNg+QtDjwP6SjzKqeofwockT+W3b0aWZm1hGtusZbr6noKmAL4LZ8zfJu0lHgqsBXgaeBq2vM+w5gO0lDmq5jrpH/3l5jWmZmZgOq1SnZCcx5WrRxSvaYwrBG2bLApcDQivMeB+wJbEmhpSywKzApItra4MfMzKw/WiXMPQZ43hcD40m3o4wkdVywG7AusHWjkqQJwPoRoULZksD6+e37ctn2+f3kiJg4wLGbmdl8plVfsqcP5IwjIiRtAxyZX8NJt5FsGxEX9jL6asC5TWWN96eTegwyMzNrm9pd47VTvkVl3/zqqc4GJWUTaN1i18zMrK360pfsO0hd0S1BSSvbiDijDXGZmZl1lcoJU9IQ4ERgL1rfjuKEaWZmg06d+zC/C+wDnE1qnCPgQODrwL3ARGCTdgdoZmbWDeokzN2ASyJiV+DvuezGiDgZ+BDw9vzXzMxs0KmTMJdnVqJsdDTwFoCIeJHUy89e7QvNzMyse9RJmC+THtYM6XFeASxVGP448N42xWVmZtZV6iTMh4AVACLideA+YNPC8I2p3/m6mZnZPKFOwryC9NDnhjOBz0san3vj2QH4UxtjMzMz6xp17sM8DvinpLdGxKvAUaRTsl8EZgCnAIe2P0QzM7POq5wwI2IKMKXwfgbwzfwyMzMb1OqckjUzM5tv9aVrvB1J1zKXz0UPAOMiwtcvzcxs0KrTNd7CpOdWfprUy8+0/PcjwI6S9gG2yvdkmpmZDSp1TskeCWwE/BJYOiJGRMQSwNK5bEPgiPaHaGZm1nl1TsnuBJwbEfsVCyPicWA/Se/OdfYrG9nMBsaoAy/qdAiVTD56806HYNYvdY4wFwfGtxh+Ra5jZmY26NRJmLcCK7YYviJwW//CMTMz6051EuYhwN6StmweIGlrUsfr/9euwMzMzLpJj9cwJf2upPhB4DxJk4C7SB2wrwqsRDq63IV0atbMzGxQadXoZ/cWw1bOr6I1gTWAPfsZk5mZWdfpMWFGhHsBMjMzy5wUzczMKuhL13gC1mb2rvFujohoZ2BmZmbdpFbClLQpcBKwbNOgyZK+FhGXtC0yMzOzLlKnL9lPAhcALwInALfnQauRGghdIGnDiLi23UGamZl1Wp0jzB8AjwMfy8/GfJOkY4Hrc51N2xeemZlZd6jT6OdjwCnNyRLefLj0b4GPtyswMzOzblInYS4IPN9i+HO5jpmZ2aBTJ2HeBewsaY7TuLlsp1zHzMxs0KmTMH9NOi17uaTNJS2XX1sAl+dhJw1EkGZmZp1WudFPRJwqaUXgu8C6JVWOjYgxbYvMzMysi9S6DzMividpDLA1sBwg4H7ggoi4ZwDiMzMz6wqVEqakt5JOuU7JifHYAY3KzMysy1S9hjmDdJ3ycwMYi5mZWdeqlDAj4g1SpwUa2HDMzMy6U51WsucCO0ryE07MzGy+U6fRz6nAhsClkn4O3Au81FwpIh5uU2xmZmZdo07CvB0I0mnZDVrUG9qfgMzMzLpRnYR5OClhmpmZzXfqdFxw2ADGYWZm1tWq3oe5JLA88HRE3D+wIZnZ/GzUgRd1OoRKJh+9eadDsLmsZYtXSUMknQxMAa4F7pF0TU6gZmZm843ebhHZF/gy6R7MvwK3AZ8AfjPAcZmZmXWV3hLmrqRHdq0SETtExFrAGGBLScP7O3NJi0o6QdIUSS9Lmihpq4rjriDpPEnTJT0v6WJJq/Y3JjMzszK9JcyVgLERUXxw9C9Jt468vw3zHwfsAhwCbA7cCYyTtFmrkSQtBVwNjAJ2Az4PjACulPSeNsRlZmY2m94a/SwCPNZU9lhhWJ/lpLgxsG1EjMtl40mNi44HLm4x+neBJYAPR8Rjedz/BzwIHAx8tT+xmZmZNavSzV3zvZeN9/3tV3Y0MB04/80JRwRwOrByL6dXRwOXNpJlHvcZ4EJg237GZWZmNocqt5VsJumdhfcLk5LmDpLWaqobEfGzivNeHbgzImY2ld9aHN48kqS3ASuQ+rZtdivwBUlLRcSTJeNO6yWmYdOnT2f48L5dnn3ulTf6NN7cNvzk6v1VDLZlGmzLA4NvmQbb8sC8s0yLL1TrEclvmj59OsDibQ2mC1VZO1/Ir2b7lJQFUDVhjgTKHjo9tTC8zBKko9upJcOK486RMCuK6dOnP9fHcdttWP47vZ0Tnf5qO6fWHTq4TN5GFXkbdb9+LNPiQPPBz6DTW8LccIDn36qrvd664as9bkT0u2Xv3NQ4Ip7X4p6feBt1P28ja5eWCTMirhzAeT9D+VHkiPy37AgS4FlSQuzLuGZmZn3SyWdb3gGsUvJ8zTXy39vLRoqIl4EHSNc4m60BPFV2/dLMzKw/OpkwxwHDgS2byncFJkXEHA1+msbdpNgYSdKIPK2/tjtQMzOzTibMi4HxwBhJ/yNpQ0ljgXWB/RuVJE2Q1HxN8jjSBfyLJW0taXPgIuAN4Mi5Er2Zmc1XOpYw8z2X2wB/JCW5vwNrkjoyuLCXcZ8APgU8ApwJnANMA9aLiIcHMm4zM5s/KeUt60Zu3df9vI26n7eRtUsnT8mamZnNM3yEaWZmVoGPMM3MzCpwwjQzM6vACdPMzKwCJ0wzM7MKnDC7kKRFJZ0gaYqklyVNlLRVp+OyRNJGksZKmiTpJUn/lfRXSWv0PrZ1iqTDJIWkWzodi82bnDC70zhgF+AQYHPSc0HHSdqso1FZw1eAZUiPsvsc8O38/gZJH+9kYFZO0mrA94AnOh2Lzbt8W0mXyUnxIlKPR+NymYCrgZERsUon4zMoe0C5pOHAg8AVEbFdZyKzMvkBD9cCN5Ae0DA8ItbqbFQ2L/IRZvcZTeon9/xGQe5G8HRgZUmrdiowS8qehhMR04B7gffM/YisF/9L2i4HdzoQm7c5YXaf1YE7I6L56eW3FoZbl5G0JGnblD6WzjpD0vLA4cC+EfFcp+OxeZsTZvcZSfkDsKcWhlsXyafMTyF9n47rcDiW5e3yW+CSiDiv0/HYvG+BTgdgpVpdWPZF5+5zLOnJO3tExF2dDsbetDfwYcCXMawtnDC7zzOUH0WOyH/Ljj6tQyQdAXwH+FZEjO1wOJZJejvwE+Ao4MXcKAvSb97Q/P6ViHilUzHavMenZLvPHcAquWVfUeMeP18j6xKSDgf+DzggIk7odDw2m/cAw0gJ89nC65Oka83PAod1KjibN/m2ki4jaXPgb8A2EXF+ofwqYKmIWLljwdmbJB1K+sH9fkT8uMPhWBNJi5JOxzb7ObAosBfwcEQ8MFcDs3maT8l2n4uB8cAYSSNJ9/btBqwLbN3JwCyR9B1SsvwbcFlTZwWvRsTNHQnM3hQRLwATmssLD5OeY5hZb3yE2YUkLQ4cCWwPDCf19HO4W/p1B0kTgPV7GPxQRIyae9FYHXnbueMC6xMnTDMzswrc6MfMzKwCJ0wzM7MKnDDNzMwqcMI0MzOrwAnTzMysAidMMzOzCubbhCkpJI3tdBx9IWlhSSdIeljSDEmTOx1TkaTd8/rdoNOxdNK8uB4kjcoxH9bpWBrmxfVog1NbE6akDfIHOyTt1UOdkPS3ds53PvQ94BvAOcDuwH4djWY+JmktSYdJGtXpWGzeI2l4/vxsUHO8gySdK+mB/Js6uZf6H5N0maTnJT0n6R+SSjtvkLS0pDMkPSXpZUkTJe1QJ77BaiCPMH8o6W0DOP352SbAbRGxf0Sc2YU9AJ0JvA24qtOBzAVrAYcCozocx2A2mD9Pw0mfnw1qjnck8GngflJH8j3KXTdeCSwH/CDPb0XgaklrNNUdAVwDbAv8GvgW8ALwJ0l71Ixx0BmohDkRWBof+QAgaaikhds4yXfSxY/5iogZEfFKRMxs1zQHYB32i6TFOh3D/GIgPk+DwAoRMTIiNgEe66XuCcBrwHoR8bOI+BmwHunZusc31T2QlFg/HxE/iIhTgI2AG4Djcqf286S2fGcjom0v0l5SAPuTkuY0YGRTnQD+VlI2tmR6u+dhGxTKDstlq5KePDAFeBG4HFgp19kWuAl4GZgMfLlk2gGMBTYGrgNeAh4HfgEsUlJ/GHAMcB/wKvAUcDawfA8xbwx8n7QH+Dqwey/rbgHSqdY7gVdIz8UcB6xRMu3m12EtpjuqUQfYEbglr5f7SA88BlgG+DMpCT8P/B5YrGk6KwMnkR4/9nxeXzcCe1fZbrn87cCJwCOkL/Aj+X3zZ6RP6zCPuxNpD7kR4/XA9j3UuwB4OG/Pp4HzgDVL6k4mdeS9NnAJMJ3UKf5hPWyPsU3L8Wngu3k5XgXuAXYrmc8Q4KA87VeA24BdCvMZVag7AZjcans3Tfdg0hHa43ndP0w6gmhe93OMn8s/m9fp1cAShfIVSUeAU/J0J5MeqL1I0/jvBX4HPJTXwZPAtWXroeLvQK1128vv1e7AHqTP9qs5xgN6GOfDpO/l07nupLxuFyjUWTtvv0vJ3Y/m8qF5G7wErFaYf/Nrju3ay3Lc3tM4wPvyNMeUDBsDzATeWSj7L3BfSd0v5ens2EssF5B+jxcvGfbRPI3vd+o7m4ctRPpOTcrzm0b6rh3b67qus2FqfAC/S9orCeCnTXXalTBvID3V4xvA4XkjTc4bdgrp1MO+wM25/rol87yVdLrhZ8DXgHNz+eXAkELdYcxKFL8Avkw6rfEEKXEuWxLzLXmcA/O01+ll3Z2Tx/tnXqYj8oZ8AVg711ke+GKe5135/y+WfWAK0x2VpzuR9GPZvF52If1AnAZ8hfQlCuDUpul8hfTFPCb//x3SjkYAB1XYbsNIP2YzgVPzOvltfn8XhQTdj3X44zze30lnN76ZPyMBfL2p7tXA+cAhpEc9HUXaSXkeWLHky/cA6dTXKXn7fxtYE/hNnv4Rhe2xTtNyXAf8BziAdIrr7lz+yab5/DyXX5lj/3H+DNxE/xLmQnk6Y/J2a2zn10g/FAv2Mv5upB2WvwILFco/RPoheoj0vdwb+BXpx+xa4C253gJ5mZ/Pn5898/obS9PnrIftWvZ5qrVue/m9uo60k3II6bvR+Fx/oan+ZnnZ7qiQ2xIAAAvJSURBVCDt2OyTl2EGcG5T3W/S9N0g/U4FsE9+/w7S5zTyum18frap+bvbKmF+Pk9/r5Jhe+dhm+f378rvf19Sd8U8rGVSAbYoLmPTsJPzunpvp76zeVjjN+500nfh66Tv3o29rus6G6bGB/C7+f0/SXtayxbqtCthXsjse2+ND+jzwDKF8iVzDGeXzDOaP5ykhBjAzk1lLwMfaKq7LPBcMfZCzJOAhSuut03yOOc0LdOawBvA1SUfhgkVpz0qT/vFpu3QWC8zGx+kwrC/kn5MFy2UlR11DyH9cE8n/zi22G5H5LKvNU3j67n8R/1chx/M4xxZMuy8vJ0W62V5ViH9IJ5Usr57+tGZY1lLht3M7Enp3Xk+ZxfKVsrb4nJgaNNyzaR/CVPA20rq7knTUUPz+KSdlSCdXRjSNP5/SAmq+WzE6DzO7oXPcdDDUVuFbVv2eaq8bltMd4M8jcdITzBplC9M2in9f4WyhUg7nFdROJrMw/637DNA+nF/HViH9HSbGcCfe9tefVg/rRLmd/L0P1cybLM87Mv5/Yfy+2NK6i6ch53VSyxDSUeA/y4ZfzpwcRd8Z6cW46jzGujbSr4HLAj8aACmfULkpc+uzn/Pj4iHG4UR8RTph3fFkmlMijkbzByd/44GkCTSUdhVwKOS3t54kZLQdcBnSqb964h4qeKyjM5/jyguU0TcSnrm4rqSlqw4rZ6cFxEPFabdWC8zSadFi64G3kKhIUtEvNj4X9JC+VmdI0g7RYuTTtm2Mpr0I3RKU/lvSKdWRs8xRr11uAt5r7G4jfJ2ugBYjPTDNdvyKFk812usk4+VTH8q6Si8L06KiNcK836UdLRd/ExuTUpsP42IGYW6N5FO7fVZJC/Dm9eCh+flvSJXKVveIZJ+RdqL/35EfC0K1xBzY5E1gbOAtzat72tI343G92J6/ruhpKX6sywlqqzb3pwWEdMK03iJ9L0uTmMT0hHhacDwpuW9ONdp/h3Yg5RkzwL+QLoEUXr3wABqXPd/tWTYK0116tQtlT+7vwM+0tSgaHvS78SYQlmnvrPTgdUkrd5qWcoMaMKM9CDds4FdJK3Z5sk3Pym90VLswZK6zwIjS8rvai6IiCmk01fL56Il87ifIW2c5lfji9Tsntbhz2Y5Zp2abHZ7oU5/lD1Z/llgSkQ0f0Ea6/LNdSZpUUnHSXqYdLT9NGn5j8hVluhl/suRdlDeKBbm95OYtb6L6qzDVUgJ527m3EaNL+mb20nS2vn2pudJX6BG3TV6WJb7i4msprJ1/wyzfyYby393Sd07+zjfN0naUdL1pG33LGlZG3GVLe9+pKP/gyPixyXDV8l/f8ic6/tJYBHy+s47akeQvkNTJN0o6SeSPtLf5aLaum3HNBrL+zvmXN7GNpvtdyAippKS5ijSke8uxcQ8lzR2ON9aMmyhpjp16rYyhnQ0vWehbE/S5+KCQlmnvrP75fq3Sbpf0qmStpbUaz5coLcKbXAIae/iGOBzNcdtFV9PP149laukLErKmus2/r+MtAxVVT0yap7fQKm7vmD2uM4iXZ84hXS0PZV0ungz0impgdj5qrsOg/QZ62mZ7gCQtAxpGZ4jnf2YRDoiCtK1jLKWgHViaVblM9n4v+wzWeezO8d3RtK2pNP9/yZd53uEdMQwFPgH5dvuUlJLyn0k/TEimpNKI6bj8zTKvHm7Q0QcIul3wObAp0hHWvtL+klEfK+H8auo832vO42y6e1PurZepqy16laF/9cC/lUjrnZoxPTukmGNskf7ULdHEfGIpH8AX5R0AKlR4XrAcRHxeqFqR76zEXF+vm96M9Kp8o1JCf1qSRsXz1g0G/CEGREPSvo18C1JG/ZQbSrp9F6zsqOOdlq1uUDSu0gNVBo/EE+RjjgXj4jLBiiO+0mtEFchNUQqi7HsyHmukDSclCzPjIivNA3buOJkHgBWkrRA8ShT0gLA+ynfy6/jXmBT4OGIKDtSLxpN+oJtFRHjiwPyqeayU1I96Slx1XV//rsKc66LVZjTVNI1p2Zl35kvkRLkhsVT3JJanUa/jdRA7HLgSkmfjoh7C8Mb/8+o+r3ISfeXwC8lLURqvXiApOMj4skq0+igxvK+WHV5JW1JasB3Gqm16nGSroqI2wrV2vX56ckN+e86pMZ2RR/P878R0tk1SY/m8maNsokV53sKaedoG1JLVZj9dCx07jvbOPr/PfD7fNntaFLDsa1JjT9Lza2u8X5M2jPo6QjtHmCd4n12kpYgnc4YSCtJ2qaprLG3ex5Avm7zB+CjkrYvm0gbrss0rqMelDdeY7qrk/ZQr8nXHDulsfc321573rmoek3mPNLp7eb6e+fycf0JkHRrA8CRkoY2D2zaRj0tz96ke1zreCH/Ldvhq+MC0o/Xt4vxS/ogaQ+42T3AYpI+Wqg7hHS032xGnvaQQl2Rzv70KCLuIDWMGUpKmsUEezPpcsFXJM2RpCUtkG+CR9IwSW9pmvYrzLoE0dvp/G5wCemU4oGN5SqS9LbifX6S3k1KlHeRWt7uQjri+aNm79ClXZ+fUhFxHynJ7SBp6UJ8SwM7AFdExOOFUc4GVsjJvlF3KCnxT2PW9dreXEQ6Gt2H1Mr6XxHRfLlhrn9nG9fwi2W53cjN+W3L7TA3TskSEU9LOpaeG//8ipTtr5B0Jqn3i71JzdXr/oDVcRtpD+O3pL2dDUmnj68kncJqOBj4JKm3iz+RGgS8RmoluxlpD233vgYREZfm6e4MLJHP07+TdA3pFVIL4I6JiOcl/ZN0iuVl0l7rsqQvw4NUu170E9IX9MScBG4m7XnuSTq98pN+xniDpENJ19RukXQu6RTTu0hHYpuRGqBBasL+EnBmbtjyLGn7bkY60qvzvbiBdP354LyT9yLpfq/ra8Z/t6QTST+uV0j6C7BUfv8fZu2lN5xCagE5TtIvSJ/H7XuI/c/Adnm6Z5AadG1DLw04CnGtT2ogNEHSRhFxR0SEpC/l8lvz6dY78jTfR7oX+iDSbRcbAqfkZZpEShIfIu08XR8Rk6qso06KiBcl7Ura8ZuUl/c+0m/VyqTlHU1aR0NIO9mLkFrbvwS8JGlP0o7hz0nfHSLiGUn3ATtLup90q9qLEXFhq3jyul82v10SWFBSYwfooYg4s1D9W6RbNa6W9Mtc9g3SDtR3miZ9NOl7epakn5KS3ueBj5BanD5fYXURETMkncasnbL/K6nTie/sYqTr6BeQfoOeJLWv+GqeZsv13qdmzD29aLqtpGnYwnllBE23leTh+zPrpua7gP+h9W0lo5rGH0UPzbMpaYKf644l7b03GkM8QTpltFgP8X+flGRfJl14vot0L+HHCvXmiLniumt0XHBXXgdTSV/ONUrqTqb+bSWV1ktPy0DqdODUvA0bN9Xv3UPd0nVA+mKfRLo5+vX890Tg7b3Nv8Z63Jx0NDA1r8dHSF+2rzbVW49ZN0tPI+0Rr97DZ6Xl+ibtQd9JSlrBnB0XzLEcPcyn0cFA43twOz10XJDrb0a6nvZq3i7HkG5PmWN7523V6BRjCinhjijG2+rzQjrVO5n0A7NmoXxZ0v11k/PyP0PagTyKfL8d6QfpZNJn+znSTsVdpPsSh1XYppU/Y60+1yX1NqBw+0vTsLHkg4+m8tVJO/eP5uV9gnTP6feBEbnOD/J0v1oy/ol52HaFso+Srm02rslViX0Cs26Na37N8VklnZK9nLSz8jzpO/LBHqb9btLR39P583ITsFMfvovLko4Mn6PklpBOfGdJCfgo0vX8Z/L8JpMac63Y2zIpT8SsbfKe9KnApyLimk7HM69TenLIocByETG5s9GYVZMv2TxC6mVon07H0w7z7eO9bEA1rpV0e0MOMxs4XyVd/26+93qeNVeuYdr8ITf+2IL0RXmQWS0LzWw+IWln0q0k+wOXRMSNHQ6pbZwwrZ3WIz126CZSP5A+3282/zmbdO3zambvvGCe52uYZmZmFfgappmZWQVOmGZmZhU4YZqZmVXghGlmZlaBE6aZmVkF/x9M7uhhVP6WfAAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "X = st.poisson(1.0)\n",
    "ks = range(6)\n",
    "fig, ax = plt.subplots()\n",
    "ax.bar(ks, X.pmf(ks))\n",
    "ax.set_xlabel('Number of major earthquakes in next 100 years')\n",
    "ax.set_ylabel('Probability of occurance')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "JLB1_h_RTVg3"
   },
   "source": [
    "### Questions\n",
    "\n",
    "+ How would the rate parameter $\\lambda$ change if the rate with each major earthquakes occured in the past was 2 every 100 years? Plot the pmf of the new Poisson random variable. You may have to add more points in the x-axis."
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "colab": {
   "name": "lecture_04.ipynb",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  },
  "latex_envs": {
   "bibliofile": "biblio.bib",
   "cite_by": "apalike",
   "current_citInitial": 1,
   "eqLabelWithNumbers": true,
   "eqNumInitial": 0
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
